Wireless communication systems and methods for long-code communications for regenerative multiple user detection involving matched-filter outputs

ABSTRACT

The invention provides improved CDMA, WCDMA (UMTS) or other spread spectrum communication systems of the type that processes one or more spread-spectrum waveforms, each representative of a waveform received from a respective user (or other transmitting device). The improvement is characterized by a first logic element that generates a residual composite spread-spectrum waveform as a function of an arithmetic difference between a composite spread-spectrum waveform for all users (or other transmitters) and an estimated spread-spectrum waveform for each user. It is further characterized by one or more second logic elements that generate, for at least a selected user (or other transmitter), a refined spread-spectrum waveform as a function of a sum of the residual composite spread-spectrum waveform and the estimated spread-spectrum waveform for that user.

This application claims the benefit of priority of (i) U.S. Provisional Application Ser. No. 60/275,846 filed Mar. 14, 2001, entitled “Improved Wireless Communications Systems and Methods”; (ii) U.S. Provisional Application Ser. No. 60/289,600 filed May 7, 2001, entitled “Improved Wireless Communications Systems and Methods Using Long-Code Multi-User Detection”; and (iii) U.S. Provisional Application Ser. No. 60/295,060 filed Jun. 1, 2001, entitled “Improved Wireless Communications Systems and Methods for a Communications Computer,” the teachings of all of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

The invention pertains to wireless communications and, more particularly, by way of example, to methods and apparatus providing multiple user detection for use in code division multiple access (CDMA) communications. The invention has application, by way of non-limiting example, in improving the capacity of cellular phone base stations.

Code-division multiple access (CDMA) is used increasingly in wireless communications. It is a form of multiplexing communications, e.g., between cellular phones and base stations, based on distinct digital codes in the communication signals. This can be contrasted with other wireless protocols, such as frequency-division multiple access and time-division multiple access, in which multiplexing is based on the use of orthogonal frequency bands and orthogonal time-slots, respectively.

A limiting factor in CDMA communication and, in particular, in so-called direct sequence CDMA (DS-CDMA) communication, is the interference between multiple cellular phone users in the same geographic area using their phones at the same time, which is referred to as multiple access interference (MAI). Multiple access interference has an effect of limiting the capacity of cellular phone base stations, driving service quality below acceptable levels when there are too many users.

A technique known as multi-user detection (MUD) is intended to reduce multiple access interference and, as a consequence, increase base station capacity. It can reduce interference not only between multiple transmissions of like strength, but also that caused by users so close to the base station as to otherwise overpower signals from other users (the so-called near/far problem). MUD generally functions on the principle that signals from multiple simultaneous users can be jointly used to improve detection of the signal from any single user. Many forms of MUD are discussed in the literature; surveys are provided in Moshavi, “Multi-User Detection for DS-CDMA Systems,” IEEE Communications Magazine (October, 1996) and Duel-Hallen et al, “Multiuser Detection for CDMA Systems,” IEEE Personal Communications (April 1995). Though a promising solution to increasing the capacity of cellular phone base stations, MUD techniques are typically so computationally intensive as to limit practical application.

An object of this invention is to provide improved methods and apparatus for wireless communications. A related object is to provide such methods and apparatus for multi-user detection or interference cancellation in code-division multiple access communications.

A further related object is to provide such methods and apparatus as provide improved short-code and/or long-code CDMA communications.

A further object of the invention is to provide such methods and apparatus as can be cost-effectively implemented and as require minimal changes in existing wireless communications infrastructure.

A still further object of the invention is to provide methods and apparatus for executing multi-user detection and related algorithms in real-time.

A still further object of the invention is to provide such methods and apparatus as manage faults for high-availability.

SUMMARY OF THE INVENTION

Wireless Communication Systems And Methods For Long-code Communications For Regenerative Multiple User Detection Involving Implicit Waveform Subtraction

The foregoing and other objects are among those attained by the invention which provides, in one aspect, an improved spread-spectrum communication system of the type that processes one or more spread-spectrum waveforms, e.g., CDMA transmissions, each representative of a waveform received from, or otherwise associated with, a respective user (or other transmitting device). The improvement is characterized by a first logic element, e.g., operating in conjunction with a wireless base station receiver and/or modem, that generates a residual composite spread-spectrum waveform as a function of a composite spread-spectrum waveform and an estimated composite spread-spectrum waveform. It is further characterized by one or more second logic elements that generate, for at least a selected user (or other transmitter), a refined matched-filter detection statistic as a function of the residual composite spread-spectrum waveform generated by the first logic element and a characteristic of an estimate of the selected user's spread-spectrum waveform.

Related aspects of the invention as described above provide a system as described above in which the first logic element comprises arithmetic logic that generates the residual composite spread-spectrum waveform based on a relation

$r_{\mathrm{res}}^{(n)}[t] \equiv r[t] - \hat{r}^{(n)}[t]$

wherein

-   $r_{\mathrm{res}}^{(n)}[t]$ is the residual composite spread-spectrum waveform,
-   $r[t]$ represents the composite spread-spectrum waveform,
-   $\hat{r}^{(n)}[t]$ represents the estimated composite spread-spectrum waveform,
-   $t$ is a sample time period, and
-   $n$ is an iteration count.
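By way of illustration only, the residual relation above can be sketched in a few lines of Python. The function name `residual_waveform` and the sample values are hypothetical and are not taken from the illustrated embodiment; waveforms are assumed to be equal-length lists of complex baseband samples.

```python
def residual_waveform(r, r_hat):
    """Return r_res[t] = r[t] - r_hat[t] for equal-length complex sample lists."""
    if len(r) != len(r_hat):
        raise ValueError("waveforms must have the same number of samples")
    return [x - x_hat for x, x_hat in zip(r, r_hat)]

# Hypothetical received composite waveform and its current estimate.
r = [1 + 1j, -1 + 0.5j, 0.25 - 1j]
r_hat = [0.75 + 1j, -1 + 0j, 0.25 - 0.5j]
r_res = residual_waveform(r, r_hat)
```

If the estimate were perfect, the residual would contain only noise; whatever remains is the portion of the received waveform that the current estimates do not yet explain.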

The estimated composite spread-spectrum waveform, according to further related aspects, can be pulse-shaped and based on estimated complex amplitudes, estimated symbols, and codes encoded within the user waveforms.

Still further aspects of the invention provide improved spread-spectrum communication systems as described above in which the one or more second logic elements comprise rake logic and summation logic, which generate the refined matched-filter detection statistic for at least the selected user based on a relation

$y_k^{(n+1)}[m] = {A_k^{(n)}}^2 \cdot \hat{b}_k^{(n)}[m] + y_{\mathrm{res},k}^{(n)}[m]$

wherein

-   ${A_k^{(n)}}^2$ represents an amplitude statistic,
-   $\hat{b}_k^{(n)}[m]$ represents a soft symbol estimate for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $y_{\mathrm{res},k}^{(n)}[m]$ represents a residual matched-filter detection statistic for the $k^{\text{th}}$ user, and
-   $n$ is an iteration count.
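A minimal sketch of the summation step in the relation above follows, assuming the amplitude statistic, soft symbol estimates and residual matched-filter statistics for one user are already available. The name `refine_with_residual` and the numbers are illustrative assumptions only.

```python
def refine_with_residual(amp_sq, b_hat, y_res):
    """Return y^(n+1)[m] = A^2 * b_hat[m] + y_res[m] for one user."""
    return [amp_sq * b + yr for b, yr in zip(b_hat, y_res)]

# Hypothetical values for one user over three symbol periods.
amp_sq = 0.5                      # amplitude statistic
b_hat = [1.0, -1.0, 0.5]          # soft symbol estimates
y_res = [0.25, -0.125, 0.0]       # residual matched-filter statistics
y_next = refine_with_residual(amp_sq, b_hat, y_res)
```

The first term restores the selected user's own contribution, which was removed along with every other user's when the residual was formed.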

Further related aspects of the invention provide improved systems as described above wherein the refined matched-filter detection statistic for each user is iteratively generated. Related aspects of the invention provide such systems in which the user spread-spectrum waveform for at least a selected user is generated by a receiver that operates on long-code CDMA signals.

Further aspects of the invention provide a spread spectrum communication system, e.g., of the type described above, having a first logic element which generates an estimated composite spread-spectrum waveform as a function of estimated user complex channel amplitudes, time lags, and user codes. A second logic element generates a residual composite spread-spectrum waveform as a function of a composite user spread-spectrum waveform and the estimated composite spread-spectrum waveform. One or more third logic elements generate a refined matched-filter detection statistic for at least a selected user as a function of the residual composite spread-spectrum waveform and a characteristic of an estimate of the selected user's spread-spectrum waveform.

A related aspect of the invention provides such systems in which the first logic element generates the estimated re-spread waveform based on a relation

$\rho^{(n)}[t] = \sum_{k=1}^{K_v} \sum_{p=1}^{L} \sum_{r} \delta\left[t - \hat{\tau}_{kp}^{(n)} - r N_c\right] \cdot \hat{\alpha}_{kp}^{(n)} \cdot c_k[r] \cdot \hat{b}_k^{(n)}\left[\lfloor r/N_k \rfloor\right]$

wherein

-   $K_v$ is a number of simultaneous dedicated physical channels for all users,
-   $\delta[t]$ is a discrete-time delta function,
-   $\hat{\alpha}_{kp}^{(n)}$ is an estimated complex channel amplitude for the $p^{\text{th}}$ multipath component for the $k^{\text{th}}$ user,
-   $c_k[r]$ represents a user code comprising at least a scrambling code, an orthogonal variable spreading factor code, and a j factor associated with even numbered dedicated physical channels,
-   $\hat{b}_k^{(n)}[m]$ represents a soft symbol estimate for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $\hat{\tau}_{kp}^{(n)}$ is an estimated time lag for the $p^{\text{th}}$ multipath component for the $k^{\text{th}}$ user,
-   $N_k$ is a spreading factor for the $k^{\text{th}}$ user,
-   $t$ is a sample time index,
-   $L$ is a number of multi-path components,
-   $N_c$ is a number of samples per chip, and
-   $n$ is an iteration count.
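The triple sum above may be easier to follow as a loop. The sketch below is an assumption-laden illustration, not the embodiment's implementation: each user is a hypothetical dictionary with keys `amps`, `lags`, `code`, `symbols` and `sf`, lags are integer sample offsets, and the code is given chip by chip over the whole processing window.

```python
def respread(users, n_samples, n_c):
    """Build rho[t] by placing amplitude-weighted, symbol-modulated chips at lagged samples."""
    rho = [0j] * n_samples
    for u in users:                                        # sum over users k
        for a_kp, tau_kp in zip(u["amps"], u["lags"]):     # sum over multipath components p
            for r, chip in enumerate(u["code"]):           # sum over chip index r
                t = tau_kp + r * n_c                       # delta[t - tau - r*N_c] is nonzero only here
                if 0 <= t < n_samples:
                    rho[t] += a_kp * chip * u["symbols"][r // u["sf"]]
    return rho

# One hypothetical user: spreading factor 2, two symbols, two multipath components.
user = {"amps": [1 + 0j, 0.5j], "lags": [0, 1],
        "code": [1, -1, 1, 1], "symbols": [1.0, -1.0], "sf": 2}
rho = respread([user], n_samples=10, n_c=2)
```

Because each chip contributes to exactly one sample per multipath component, the work grows with the number of users times the number of components, rather than with any product across users.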

Related aspects of the invention provide systems as described above wherein the first logic element comprises arithmetic logic that generates the estimated composite spread-spectrum waveform based on the relation

$\hat{r}^{(n)}[t] = \sum_{r} g[r]\,\rho^{(n)}[t - r],$

wherein

-   $\hat{r}^{(n)}[t]$ represents the estimated composite spread-spectrum waveform, and
-   $g[t]$ represents a raised-cosine pulse shape.
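The pulse-shaping relation above is a discrete convolution. The following sketch assumes a short, causal filter indexed from zero and output truncated to the input length; the three taps shown are placeholders chosen for easy arithmetic and are not actual raised-cosine coefficients.

```python
def pulse_shape(rho, g):
    """Return r_hat[t] = sum_r g[r] * rho[t - r], truncated to len(rho) samples."""
    out = [0j] * len(rho)
    for t in range(len(rho)):
        for r, g_r in enumerate(g):
            if 0 <= t - r < len(rho):      # ignore taps that reach before the first sample
                out[t] += g_r * rho[t - r]
    return out

rho_example = [1, 0, -1, 0]        # hypothetical re-spread samples
g_example = [0.5, 1.0, 0.5]        # placeholder taps, NOT a real raised-cosine response
r_hat_example = pulse_shape(rho_example, g_example)
```

A practical filter would typically be centered (non-causal) and longer; the indexing convention here is a simplification.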

Related aspects of the invention provide such systems that comprise a CDMA base station, e.g., of the type for use in relaying voice and data traffic from cellular phone and/or modem users. Still further aspects of the invention provide improved spread spectrum communication systems as described above in which the user waveforms are encoded using long-code CDMA protocols.

Still other aspects of the invention provide methods for multiple user detection in a spread-spectrum communication system paralleling the operations described above.

Wireless Communication Systems And Methods For Long-code Communications For Regenerative Multiple User Detection Involving Matched-filter Outputs

Further aspects of the invention provide an improved spread spectrum communication system, e.g., of the type described above, having a first logic element, operating in conjunction with a wireless base station receiver and/or modem, that generates an estimated composite spread-spectrum waveform as a function of user waveform characteristics, e.g., estimated complex amplitudes, time lags, symbols and code. The invention is further characterized by one or more second logic elements that generate for at least a selected user a refined matched-filter detection statistic as a function of a difference between a first matched-filter detection statistic for that user and an estimated matched-filter detection statistic, the latter of which is a function of the estimated composite spread-spectrum waveform generated by the first logic element.

Related aspects of the invention as described above provide for improved wireless communications wherein each of the second logic elements generates the refined matched-filter detection statistic for the selected user as a function of a difference between (i) a sum of the first matched-filter detection statistic for that user and a characteristic of an estimate of that user's spread-spectrum waveform, and (ii) the estimated matched-filter detection statistic for that user based on the estimated composite spread-spectrum waveform.

Further related aspects of the invention provide systems as described above in which the second logic elements comprise rake logic and summation logic which generate refined matched-filter detection statistics for at least a selected user in accord with the relation

$y_k^{(n+1)}[m] = {A_k^{(n)}}^2 \cdot \hat{b}_k^{(n)}[m] + y_k^{(n)}[m] - y_{\mathrm{est},k}^{(n)}[m]$

wherein

-   ${A_k^{(n)}}^2$ represents an amplitude statistic,
-   $\hat{b}_k^{(n)}[m]$ represents a soft symbol estimate for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $y_k^{(n)}[m]$ represents the first matched-filter detection statistic,
-   $y_{\mathrm{est},k}^{(n)}[m]$ represents the estimated matched-filter detection statistic, and
-   $n$ is an iteration count.
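The relation above operates entirely on matched-filter outputs, so no residual waveform needs to be formed at the sample rate. A minimal sketch, with a hypothetical function name and made-up values, is:

```python
def refine_from_mf_outputs(amp_sq, b_hat, y, y_est):
    """Return y^(n+1)[m] = A^2 * b_hat[m] + y[m] - y_est[m] for one user."""
    return [amp_sq * b + y_m - ye for b, y_m, ye in zip(b_hat, y, y_est)]

amp_sq_k = 2.0                 # amplitude statistic (hypothetical)
b_hat_k = [0.5, -1.0]          # soft symbol estimates
y_k = [1.5, -2.5]              # first matched-filter detection statistics
y_est_k = [1.25, -2.0]         # estimated matched-filter detection statistics
y_k_next = refine_from_mf_outputs(amp_sq_k, b_hat_k, y_k, y_est_k)
```

The difference `y[m] - y_est[m]` plays the role of the residual statistic; it is computed at the symbol rate rather than at the sample rate.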

Other related aspects of the invention include generating the refined matched-filter detection statistic for the selected user and iteratively refining that detection statistic zero or more times.

Related aspects of the invention as described above provide for improved wireless communications methods wherein an estimated composite spread-spectrum waveform is based on the relation

$y_{\mathrm{est},k}^{(n)}[m] \equiv \mathrm{Re}\left\{ \sum_{p=1}^{L} \hat{\alpha}_{kp}^{(n)H} \cdot \frac{1}{2N_k} \sum_{r=0}^{N_k-1} \hat{r}^{(n)}\left[r N_c + \hat{\tau}_{kp}^{(n)} + m T_k\right] \cdot c_{km}^{*}[r] \right\},$

wherein

-   $L$ is a number of multi-path components,
-   $\hat{\alpha}_{kp}^{(n)}$ is an estimated complex channel amplitude for the $p^{\text{th}}$ multipath component for the $k^{\text{th}}$ user,
-   $N_k$ is a spreading factor for the $k^{\text{th}}$ user,
-   $\hat{r}^{(n)}[t]$ represents the estimated composite spread-spectrum waveform,
-   $N_c$ is a number of samples per chip,
-   $\hat{\tau}_{kp}^{(n)}$ is an estimated time lag for the $p^{\text{th}}$ multipath component for the $k^{\text{th}}$ user,
-   $m$ is a symbol period,
-   $T_k$ is a data bit duration,
-   $n$ is an iteration count, and
-   $c_{km}[r]$ represents a user code comprising at least a scrambling code, an orthogonal variable spreading factor code, and a j factor associated with even numbered dedicated physical channels.

Wireless Communication Systems And Methods For Long-code Communications For Regenerative Multiple User Detection Involving Pre-maximal Combination Matched Filter Outputs

Still further aspects of the invention provide improved spread-spectrum communication systems, e.g., of the type described above, having one or more first logic elements, e.g., operating in conjunction with a wireless base station receiver and/or modem, that generate a first complex channel amplitude estimate corresponding to at least a selected user and a selected finger of a rake receiver that receives the selected user waveforms. One or more second logic elements generate an estimated composite spread-spectrum waveform that is a function of one or more complex channel amplitudes, estimated delay lags, estimated symbols, and/or codes of the one or more user spread-spectrum waveforms. One or more third logic elements generate a second pre-combination matched-filter detection statistic for at least a selected user and for at least a selected finger as a function of a first pre-combination matched-filter detection statistic for that user and a pre-combination estimated matched-filter detection statistic for that user.

Related aspects of the invention provide systems as described above in which one or more fourth logic elements generate a second complex channel amplitude estimate corresponding to at least a selected user and at least a selected finger.

Still further aspects of the invention provide systems as described above in which the third logic elements generate the second pre-combination matched-filter detection statistic for at least the selected user and at least the selected finger as a function of a difference between (i) the sum of the first pre-combination matched-filter detection statistic for that user and that finger and a characteristic of an estimate of the selected user's spread-spectrum waveform and (ii) the pre-combination estimated matched-filter detection statistic for that user and that finger.

Related aspects of the invention as described above provide for the first logic elements generating a complex channel amplitude estimate corresponding to at least a selected user and at least a selected finger of a rake receiver that receives the selected user waveforms based on a relation

$\hat{\alpha}_{kp}^{(n)} \equiv \sum_{s} w[s] \cdot \frac{1}{N_p} \sum_{m=0}^{N_p-1} y_{kp}^{(n)}[m + Ms] \cdot \hat{b}_k^{(n)}[m + Ms]$

wherein

-   $\hat{\alpha}_{kp}^{(n)}$ is a complex channel amplitude estimate corresponding to the $p^{\text{th}}$ finger of the $k^{\text{th}}$ user,
-   $w[s]$ is a filter,
-   $N_p$ is a number of symbols,
-   $y_{kp}^{(n)}[m]$ is a first pre-combination matched-filter detection statistic corresponding to the $p^{\text{th}}$ finger of the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $M$ is a number of symbols per slot,
-   $\hat{b}_k^{(n)}[m]$ represents a soft symbol estimate for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $m$ is a symbol period index,
-   $s$ is a slot index, and
-   $n$ is an iteration count.
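As a rough illustration of the amplitude relation above, the sketch below averages the product of finger output and symbol estimate over the first `n_p` symbols of each slot and then applies the filter across slots. It assumes, beyond what the text states, that the slot index starts at zero and runs over the length of `w`; all names and values are hypothetical.

```python
def estimate_amplitude(y_kp, b_hat, w, n_p, m_per_slot):
    """Return a_hat = sum_s w[s] * (1/N_p) * sum_{m<N_p} y_kp[m + M*s] * b_hat[m + M*s]."""
    a_hat = 0j
    for s, w_s in enumerate(w):
        base = m_per_slot * s                       # first symbol of slot s
        slot_avg = sum(y_kp[base + m] * b_hat[base + m] for m in range(n_p)) / n_p
        a_hat += w_s * slot_avg
    return a_hat

# Noise-free toy data: finger output equals true amplitude times a +/-1 symbol.
true_a = 0.5 - 0.25j
symbols = [1, -1, 1, 1, -1, -1, 1, -1]              # two slots of four symbols
finger_out = [true_a * b for b in symbols]
a_est = estimate_amplitude(finger_out, symbols, w=[0.5, 0.5], n_p=2, m_per_slot=4)
```

With unit-magnitude symbols and no noise, multiplying by the symbol estimate strips the data modulation, so the average recovers the amplitude.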

Further related aspects of the invention as described above provide for one or more second logic elements, each coupled with a first logic element and using the complex channel amplitudes generated therefrom to generate an estimated composite re-spread waveform based on the relation

$\rho^{(n)}[t] = \sum_{k=1}^{K_v} \sum_{p=1}^{L} \sum_{r} \delta\left[t - \hat{\tau}_{kp}^{(n)} - r N_c\right] \cdot \hat{\alpha}_{kp}^{(n)} \cdot c_k[r] \cdot \hat{b}_k^{(n)}\left[\lfloor r/N_k \rfloor\right],$

wherein

-   $K_v$ is a number of simultaneous dedicated physical channels for all users,
-   $\delta[t]$ is a discrete-time delta function,
-   $\hat{\alpha}_{kp}^{(n)}$ is an estimated complex channel amplitude for the $p^{\text{th}}$ multipath component for the $k^{\text{th}}$ user,
-   $c_k[r]$ represents a user code comprising at least a scrambling code, an orthogonal variable spreading factor code, and a j factor associated with even numbered dedicated physical channels,
-   $\hat{b}_k^{(n)}[m]$ represents a soft symbol estimate for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $\hat{\tau}_{kp}^{(n)}$ is an estimated time lag for the $p^{\text{th}}$ multipath component for the $k^{\text{th}}$ user,
-   $N_k$ is a spreading factor for the $k^{\text{th}}$ user,
-   $t$ is a sample time index,
-   $L$ is a number of multi-path components,
-   $N_c$ is a number of samples per chip, and
-   $n$ is an iteration count.

Further related aspects of the invention provide systems as described above in which the second logic element comprises arithmetic logic that generates the estimated composite spread-spectrum waveform based on a relation

$\hat{r}^{(n)}[t] = \sum_{r} g[r]\,\rho^{(n)}[t - r]$

wherein

-   $\hat{r}^{(n)}[t]$ represents the estimated composite spread-spectrum waveform, and
-   $g[t]$ represents a pulse shape.

Still further related aspects of the invention provide systems as described above in which the third logic elements comprise arithmetic logic that generates the second pre-combination matched-filter detection statistic based on the relation

$y_{kp}^{(n+1)}[m] \equiv \hat{\alpha}_{kp}^{(n)} \cdot \hat{b}_k^{(n)}[m] + y_{kp}^{(n)}[m] - y_{\mathrm{est},kp}^{(n)}[m]$

wherein

-   $y_{kp}^{(n+1)}[m]$ represents the pre-combination matched-filter detection statistic for the $p^{\text{th}}$ finger for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $\hat{\alpha}_{kp}^{(n)}$ is the complex channel amplitude for the $p^{\text{th}}$ finger for the $k^{\text{th}}$ user,
-   $\hat{b}_k^{(n)}[m]$ represents a soft symbol estimate for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $y_{kp}^{(n)}[m]$ represents the first pre-combination matched-filter detection statistic for the $p^{\text{th}}$ finger for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period,
-   $y_{\mathrm{est},kp}^{(n)}[m]$ represents the pre-combination estimated matched-filter detection statistic for the $p^{\text{th}}$ finger for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period, and
-   $n$ is an iteration count.
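The per-finger update above differs from the post-combination form in that the statistics stay complex and are kept separate for each finger. A minimal sketch for one user and one finger, using hypothetical names and values, might read:

```python
def refine_finger_statistics(a_hat_kp, b_hat, y_kp, y_est_kp):
    """Return y_kp^(n+1)[m] = a_hat_kp * b_hat[m] + y_kp[m] - y_est_kp[m] for one finger."""
    return [a_hat_kp * b + y - y_est for b, y, y_est in zip(b_hat, y_kp, y_est_kp)]

a_hat_finger = 0.5 + 0.5j                      # complex channel amplitude for this finger
b_hat_user = [1.0, -1.0]                       # soft symbol estimates
y_finger = [0.75 + 0.5j, -0.5 - 0.25j]         # first pre-combination statistics
y_est_finger = [0.25 + 0j, 0.25j]              # pre-combination estimated statistics
y_finger_next = refine_finger_statistics(a_hat_finger, b_hat_user, y_finger, y_est_finger)
```

Keeping the refined statistics per finger leaves them available for re-estimating the channel amplitude before the fingers are combined.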

Still further aspects of the invention provide methods of operating multiuser detector logic, wireless base stations and/or other wireless receiving devices or systems operating in the manner of the apparatus above. Further aspects of the invention provide such systems in which the first and second logic elements are implemented on any of processors, field programmable gate arrays, array processors and co-processors, or any combination thereof. Other aspects of the invention provide for iteratively refining the pre-combination matched-filter detection statistics zero or more times.

Other aspects of the invention provide methods for an improved spread-spectrum communication system of the type described above.

BRIEF DESCRIPTION OF THE ILLUSTRATED EMBODIMENT

A more complete understanding of the invention may be attained by reference to the drawings, in which:

FIG. 1 is a block diagram of components of a wireless base-station utilizing a multi-user detection apparatus according to the invention.

FIG. 2 is a detailed diagram of a modem of the type that receives spread-spectrum waveforms and generates a baseband spectrum waveform together with amplitude and time lag estimates as used by the invention.

FIGS. 3 and 4 depict methods according to the invention for multiple user detection using explicitly regenerated user waveforms which are added to a residual waveform.

FIG. 5 depicts methods according to the invention for multiple user detection in which user waveforms are regenerated from a composite spread-spectrum pulse-shaped waveform.

FIG. 6 depicts methods according to the invention for multiple user detection using matched-filter outputs where a composite spread-spectrum pulse-shaped waveform is rake-processed.

FIG. 7 depicts methods according to the invention for multiple user detection using pre-maximum ratio combined matched-filter output, where a composite spread-spectrum pulse-shaped waveform is rake-processed.

FIG. 8 depicts an approach for processing user waveforms using full or partial decoding at various time-transmission intervals based on user class.

FIG. 9 depicts an approach for combining multi-path data across received frame boundaries to preserve the number of multi user detection processing frame counts.

FIG. 10 illustrates the mapping of rake receiver output to virtual to preserve spreading factor and number of data channels across multiple user detection processing frames where the data is linear and contiguous in memory.

FIG. 11 depicts a long-code loading implementation utilizing pipelined processing and a triple-iteration of refinement in a system according to the invention; and

FIG. 12 illustrates skewing of multiple user waveforms.

DETAILED DESCRIPTION OF THE ILLUSTRATED EMBODIMENT

Code-division multiple access (CDMA) waveforms or signals transmitted, e.g., from a user cellular phone, modem or other CDMA signal source, can become distorted by, and undergo amplitude fades and phase shifts due to, phenomena such as scattering, diffraction and/or reflection off buildings and other natural and man-made structures. This includes CDMA, DS/CDMA, IS-95 CDMA, CDMAOne, CDMA2000 1X, CDMA2000 1xEV-DO, WCDMA (or UMTS), and other forms of CDMA, which are collectively referred to hereinafter as CDMA or WCDMA. Often the user or other source (collectively, “user”) is also moving, e.g., in a car or train, adding to the resulting signal distortion by alternately increasing and decreasing the distances to and numbers of buildings, structures and other distorting factors between the user and the base station.

In general, because each user signal can be distorted several different ways en route to the base station or other receiver (hereinafter, collectively, “base station”), the signal may be received in several components, each with a different time lag or phase shift. To maximize detection of a given user signal across multiple time lags, a rake receiver is utilized. Such a receiver is coupled to one or more RF antennas (which serve as collection point(s) for the time-lagged components) and includes multiple fingers, each designed to detect a different multipath component of the user signal. By combining the components, e.g., in power or amplitude, the receiver permits the original waveform to be discerned more readily, e.g., by downstream elements in the base station and/or communications path.

A base station must typically handle multiple user signals, and detect and differentiate among signals received from multiple simultaneous users, e.g., multiple cell phone users in the vicinity of the base station. Detection is typically accomplished through use of multiple rake receivers, one dedicated to each user. This strategy is referred to as single user detection (SUD). Alternately, one larger receiver can be assigned to demodulate the totality of users jointly. This strategy is referred to as multiple user detection (MUD). Multiple user detection can be accomplished through various techniques which aim to discern the individual user signals and to reduce signal outage probability or bit-error rates (BER) to acceptable levels.

However, the process has heretofore been limited due to computational complexities which can increase exponentially with respect to the number of simultaneous users. Described below are embodiments that overcome this, providing, for example, methods for multiple user detection wherein the computational complexity is linear with respect to the number of users and providing, by way of further example, apparatus for implementing those and other methods that improve the throughput of CDMA and other spread-spectrum receivers. The illustrated embodiments are implemented in connection with long-code CDMA transmitting and receiver apparatus; however those skilled in the art will appreciate that the methods and apparatus therein may be used in connection with short-code and other CDMA signalling protocols and receiving apparatus, as well as with other spread spectrum signalling protocols and receiving apparatus. In these regards and as used herein, the terms long-code and short-code are used in their conventional sense: the former referring to codes that exceed one symbol period; the latter, to codes that are a single symbol period or less.

Five embodiments of long-code regeneration and waveform refinement are presented herein. The first two may be referred to as a base-line embodiment and a residual signal embodiment. The remaining three embodiments use implicit waveform subtraction, matched-filter outputs rather than antenna streams and pre-maximum ratio combination of matched-filter outputs. It will be appreciated by those skilled in the art that other modifications to these techniques can be implemented that produce like results based on modifications of the methods described herein.

FIG. 1 depicts components of a wireless base station 100 of the type in which the invention is practiced. The base station 100 includes an antenna array 114, radio frequency/intermediate frequency (RF/IF) analog-to-digital converter (ADC), multi-antenna receivers 110, rake modems 112, MUD processing logic 118 and symbol rate processing logic 120, coupled as shown.

Antenna array 114 and receivers 110 are conventional such devices of the type used in wireless base stations to receive wideband CDMA (hereinafter “WCDMA”) transmissions from multiple simultaneous users (here, identified by numbers 1 through K). Each RF/IF receiver (e.g., 110) is coupled to antenna or antennas 114 in the conventional manner known in the art, with one RF/IF receiver 110 allocated for each antenna 114. Moreover, the antennas are arranged per convention to receive components of the respective user waveforms along different lagged signal paths discussed above. Though only three antennas 114 and three receivers 110 are shown, the methods and systems taught herein may be used with any number of such devices, regardless of whether configured as a base station, a mobile unit or otherwise. Moreover, as noted above, they may be applied in processing other CDMA and wireless communications signals.

Each RF/IF receiver 110 routes digital data to each modem 112. Because there are multiple antennas, here, Q of them, there are typically Q separate channel signals communicated to each modem card 112.

Generally, each user generating a WCDMA signal (or other subject wireless communication signal) received and processed by the base station is assigned a unique long-code code sequence for purpose of differentiating between the multiple user waveforms received at the base station, and each user is assigned a unique rake modem 112 for purpose of demodulating the user's received signal. Each modem 112 may be independent, or may share resources from a pool. The rake modems 112 process the received signal components along fingers, with each receiver discerning the signals associated with that receiver's respective user codes. The received signal components are denoted here as $r_{kq}[t]$, denoting the channel signal (or waveform) from the $k^{\text{th}}$ user from the $q^{\text{th}}$ antenna, or $r_k[t]$, denoting all channel signals (or waveforms) originating from the $k^{\text{th}}$ user, in which case $r_k[t]$ is understood to be a column vector with one element for each of the Q antennas. The modems 112 process the received signals $r_k[t]$ to generate detection statistics $y_k^{(0)}[m]$ for the $k^{\text{th}}$ user for the $m^{\text{th}}$ symbol period. To this end, the modems 112 can, for example, combine the components $r_{kq}[t]$ by power, amplitude or otherwise, in the conventional manner to generate the respective detection statistics $y_k^{(0)}[m]$. In the course of such processing, each modem 112 determines the amplitude (denoted herein as α) of and time lag (denoted herein as τ) between the multiple components of the respective user channel. The modems 112 can be constructed and operated in the conventional manner known in the art, optionally, as modified in accord with the teachings of some of the embodiments below.
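One conventional way of combining finger outputs by amplitude is maximal-ratio combining, in which each finger is weighted by the conjugate of its channel amplitude before summing. The sketch below assumes that choice purely for illustration; the text leaves the combining method open, and the function name and values are hypothetical.

```python
def mrc_combine(finger_outputs, amps):
    """Return y[m] = Re{ sum_p conj(a_p) * y_p[m] } over all fingers p."""
    n_symbols = len(finger_outputs[0])
    return [
        sum((a.conjugate() * y_p[m]).real for a, y_p in zip(amps, finger_outputs))
        for m in range(n_symbols)
    ]

# Two fingers, two symbol periods; noise-free outputs for symbols +1 and -1.
finger_amps = [1 + 0j, 0.5j]
finger_outputs = [[1 + 0j, -1 + 0j], [0.5j, -0.5j]]
y0 = mrc_combine(finger_outputs, finger_amps)
```

The conjugate weighting removes each path's phase rotation so that the fingers add coherently, with stronger paths contributing more.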

The modems 112 route their respective user detection statistics y_(k)⁽⁰⁾[m], as well as the amplitudes and time lags, to common userdetection (MUD) 118 logic constructed and operated as described in thesections that follow. The MUD logic 118 processes the received signalsfrom each modem 112 to generate a refined output, y_(k) ⁽¹⁾[m], or moregenerally, y_(k) ^((n))[m], where n is an index reflecting the number oftimes the detection statistics are iteratively or regenerativelyprocessed by the logic 118. Thus, whereas the detection statisticproduced by the modems is denoted as y_(k) ⁽⁰⁾[m] indicating that therehas been no refinement, those generated by processing the y_(k) ⁽⁰⁾[m]detection statistics with logic 118 are denoted y_(k) ⁽⁰⁾[m], thosegenerated by processing the y_(k) ⁽¹⁾[m] detection statistics with logic118 are denoted y_(k) ⁽²⁾[m], and so forth. Further waveforms used andgenerated by logic 118 are similarly denoted, e.g., r^((n))[t].

Though discussed below are embodiments in which the logic 118 is utilized only once, i.e., to generate y_(k) ⁽¹⁾[m] from y_(k) ⁽⁰⁾[m], other embodiments may employ that logic 118 multiple times to generate still more refined detection statistics, e.g., for wireless communications applications requiring lower bit error rates (BER). For example, in some implementations, a single logic stage 118 is used for voice applications, whereas two or more logic stages are used for data applications. Where multiple stages are employed, each may be carried out using the same hardware device (e.g., processor, co-processor or field programmable gate array) or with a successive series of such devices.

The refined user detection statistics, e.g., y_(k) ⁽¹⁾[m] or more generally y_(k) ^((n))[m], are communicated by the MUD process 118 to a symbol process 120. This determines the digital information contained within the detection statistics, and processes (or otherwise directs) that information according to the user class to which the user belongs, e.g., voice or data user, all in the conventional manner.

Though the discussion herein focuses on use of MUD logic 118 in a wireless base station, those skilled in the art will appreciate that the teachings hereof are equally applicable to MUD detection in any other CDMA signal processing environment such as, by way of non-limiting example, cellular phones and modems. For convenience, such cellular base stations and other environments are referred to herein as “base stations.”

Referring to FIG. 2, modem 112 receives the channel signals r[t] 122 from the RF/IF receiver (FIG. 1). The signals are first input into a searcher receiver 212. The searcher receiver analyzes the digital waveform input, and estimates a time offset {circumflex over (τ)}_(kp)^((n)) for each signal component (e.g., for each finger). As those skilled in the art will appreciate, the “hat” or ^ symbol denotes estimated values. The time offset for each antenna channel is communicated to a corresponding rake receiver 214.

The rake receivers 214 receive both the digital signals r[t] from the RF/IF receivers, and the time offsets, {circumflex over (τ)}_(kp) ^((n)). The receivers 214 calculate the pre-combination matched-filter detection statistics, y_(kp) ⁽⁰⁾[m], and estimate the signal amplitude, {circumflex over (α)}_(kp) ^((n)), for each of the signals. The amplitudes are complex in value, and hence include both the magnitude and phase information. The pre-combination matched-filter detection statistics, y_(kp) ⁽⁰⁾[m], and the amplitudes {circumflex over (α)}_(kp) ^((n)) for each finger receiver 212, are routed to a maximal ratio combining (MRC) 216 process and combined to form a first approximation of the symbols transmitted by each user, denoted y_(k) ⁽⁰⁾[m]. While the MRC 216 process is utilized in the illustrated embodiment, other methods for combining the multiple signals are known in the art, e.g., optimal combining, equal gain combining and selection combining, among others, and can be used to achieve the same results.

At this point, it can be appreciated by one skilled in the art that each detection statistic, y_(k) ⁽⁰⁾[m], contains not only the signal originating from user k, but also has components (e.g., interference and noise) that have originated in the channel (e.g., the environment in which the signal was propagated and/or in the receiving apparatus itself). Hence, it is further necessary to differentiate each user's signal from all others. This function is provided by the multiple user detection (MUD) card 118.

The methods and apparatus described below provide for processing long-code WCDMA at sample rates and can be introduced into a conventional base station as an enhancement to the matched-filter rake receiver. The algorithms and processes can be implemented in hardware, software, or any combination of the two including firmware, field programmable gate arrays (FPGAs), co-processors, and/or array processors.

The following discussion illustrates the calculations involved in the illustrated multiple user detection process. For the following discussion, and as can be recognized by one skilled in the art, the term physical user refers to an actual user. Each physical user is regarded as a composition of virtual users. The concept of virtual users is used to account for both the dedicated physical data channels (DPDCH) and the dedicated physical control channel (DPCCH). There are 1+N_(dk) virtual users corresponding to the k^(th) physical user, where N_(dk) is the number of DPDCHs for the k^(th) user.

As one with ordinary skill in the art can appreciate, when long-codes are used, the base-band received signal r[t], which is a column vector with one element per antenna, can be modeled as:

$\begin{matrix}{{r\lbrack t\rbrack} = {{\sum\limits_{k = 1}^{K_{v}}{\sum\limits_{m}{{{\overset{\sim}{s}}_{km}\left\lbrack {t - {mT}_{k}} \right\rbrack}{b_{k}\lbrack m\rbrack}}}} + {w\lbrack t\rbrack}}} & (1)\end{matrix}$
where t is the integer time sample index, K_(v) is the number of virtual users, T_(k)=N_(k)N_(c) is the channel symbol duration, which depends on the user spreading factor, N_(k) is the spreading factor for the k^(th) virtual user, N_(c) is the number of samples per chip, w[t] is receiver noise and other-cell interference, {tilde over (s)}_(km)[t] is the channel-corrupted signature waveform for the k^(th) virtual user over the m^(th) symbol period, and b_(k)[m] is the channel symbol for the k^(th) virtual user over the m^(th) symbol period.

Since long-codes extend over many symbol periods, the user signature waveform and hence the channel-corrupted signature waveform vary from symbol period to symbol period. For L multi-path components, the channel-corrupted signature waveform for the k^(th) virtual user is modeled as,

$\begin{matrix}{{{\overset{\sim}{s}}_{km}\lbrack t\rbrack} = {\sum\limits_{p = 1}^{L}{a_{kp}{s_{km}\left\lbrack {t - \tau_{kp}} \right\rbrack}}}} & (2)\end{matrix}$
where α_(kp) are the complex multi-path amplitudes. The amplitude ratios β_(k) are incorporated into the amplitudes α_(kp). One skilled in the art will see that if k and l are virtual users corresponding to the DPCCH and the DPDCHs of the same physical user, then, aside from scaling by β_(k) and β_(l), the amplitudes α_(kp) and α_(lp) are equal. This is due to the fact that the signal waveforms for both the DPCCH and the DPDCH pass through the same channel.

The waveform s_(km)[t] is referred to as the signature waveform for the k^(th) virtual user over the m^(th) symbol period. This waveform is generated by passing the code sequence c_(km)[r] through a pulse-shaping filter g[t],

$\begin{matrix}{{s_{km}\lbrack t\rbrack} = {\sum\limits_{r = 0}^{N_{k} - 1}{{g\left\lbrack {t - {rN}_{c}} \right\rbrack}{c_{km}\lbrack r\rbrack}}}} & (3)\end{matrix}$
where g[t] is the raised-cosine pulse shape. Since g[t] is a raised-cosine pulse as opposed to a root-raised-cosine pulse, the received signal r[t] represents the baseband signal after filtering by the matched chip filter. The code sequence c_(km)[r]≡c_(k)[r+mN_(k)] represents the combined scrambling code, orthogonal variable spreading factor (OVSF) code and j factor associated with even numbered DPDCHs.

The received signal r[t], which has been match-filtered to the chip pulse, is next match-filtered by the user long-code sequence filter and combined over multiple fingers. The resulting detection statistic is denoted here as y_(l)[m], the matched-filter output for the l^(th) virtual user over the m^(th) symbol period. The matched-filter output y_(l)[m] for the l^(th) virtual user can be written,

$\begin{matrix}{{y_{l}\lbrack m\rbrack} \equiv {{Re}\left\{ {\sum\limits_{q = 1}^{L}{{{\hat{a}}_{lq}^{H} \cdot \frac{1}{2N_{l}}}{\sum\limits_{n = 0}^{N_{l} - 1}{{r\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}}}}} \right\}}} & (4)\end{matrix}$where {circumflex over (α)}_(lq) ^(H) is the estimate of α_(lq) ^(H),and {circumflex over (τ)}_(lq) is the estimate of τ_(lq).
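As an illustration of equation (4), the following is a minimal Python sketch of the matched-filter output for one virtual user and one symbol period. It assumes a single antenna, so that the Hermitian transpose reduces to a complex conjugate, and integer sample lags. The function name `matched_filter_output` and its argument names are hypothetical and are not taken from the illustrated embodiment.

```python
import numpy as np

def matched_filter_output(r, code, amp_hat, tau_hat, m, n_c):
    """Evaluate the form of equation (4) for one virtual user and symbol period m.

    r       : complex array of received samples (single antenna, an assumption)
    code    : complex array of length N_l holding the chips c_lm[n] for symbol m
    amp_hat : complex array of estimated finger amplitudes
    tau_hat : integer array of estimated finger lags, in samples
    n_c     : number of samples per chip, N_c
    """
    n_l = len(code)
    t_l = n_l * n_c                      # symbol duration T_l = N_l * N_c
    chip_idx = np.arange(n_l) * n_c      # sample offset n * N_c of each chip
    y = 0.0
    for a_q, tau_q in zip(amp_hat, tau_hat):
        samples = r[chip_idx + tau_q + m * t_l]
        despread = np.sum(samples * np.conj(code)) / (2 * n_l)
        y += np.real(np.conj(a_q) * despread)
    return y
```

With chips of the form ±1±j, each chip has squared magnitude 2, so the 1/(2N_l) factor normalizes a unit-amplitude, single-finger symbol to unity.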

Because of the extreme computational complexity of symbol-rate multiple user detection for long-codes, it is advantageous to resort to regenerative multiple user detection when long-codes are used. Although regenerative multiple user detection operates at the sample rate, for long-codes the overall complexity is lower than with symbol-rate multiple user detection. Symbol-rate multiple user detection requires calculating the correlation matrices every symbol period, which is unnecessary with the signal regeneration methods described herein.

For regenerative multiple user detection, the signal waveforms of interferers are regenerated at the sample rate and effectively subtracted from the received signal. A second pass through the matched filter then yields improved performance. The computational complexity of regenerative multiple user detection is linear with the number of users.

By way of review, regenerative multiple user detection can be implemented as a baseline implementation. Referring back to the received signal r[t]:

$\begin{matrix}{\begin{matrix}{{r\lbrack t\rbrack} = {{\sum\limits_{k = 1}^{K_{v}}{\sum\limits_{m}{\sum\limits_{p = 1}^{L}{a_{kp}{s_{km}\left\lbrack {t - \tau_{kp} - {mT}_{k}} \right\rbrack}{b_{k}\lbrack m\rbrack}}}}} + {w\lbrack t\rbrack}}} \\{= {{\sum\limits_{k = 1}^{K_{v}}{r_{k}\lbrack t\rbrack}} + {w\lbrack t\rbrack}}}\end{matrix}{{r_{k}\lbrack t\rbrack} \equiv {\sum\limits_{m}{\sum\limits_{p = 1}^{L}{a_{kp}{s_{km}\left\lbrack {t - \tau_{kp} - {mT}_{k}} \right\rbrack}{b_{k}\lbrack m\rbrack}}}}}} & (5)\end{matrix}$

For the baseline implementation, all estimated interference is subtracted, yielding a cleaned-up signal {circumflex over (r)}_(l)^((n+1))[t] as follows:

$\begin{matrix}{\begin{matrix}{\hat{r}_{l}^{(n+1)}[t] = r[t] - \sum\limits_{\underset{k \neq l}{k = 1}}^{K_{v}} \hat{r}_{k}^{(n)}[t]} \\ {\hat{r}_{k}^{(n)}[t] \equiv \sum\limits_{m} \sum\limits_{p = 1}^{L} \hat{a}_{kp}^{(n)} \cdot s_{km}\left[ t - \hat{\tau}_{kp}^{(n)} - mT_{k} \right] \cdot \hat{b}_{k}^{(n)}[m]}\end{matrix}} & (6)\end{matrix}$

The implementation represented by Equation (6) corresponds to a total subtraction of the estimated interference. One skilled in the art will appreciate that performance can typically be improved if only a fraction of the total estimated interference is subtracted (i.e., partial interference subtraction), this owing to channel and symbol estimation errors. Equation (6) is easily modified so as to incorporate partial interference cancellation by introducing a multiplicative constant of magnitude less than unity to the sum total of the estimated interference. When multiple cancellation stages are used, the optimum value of this constant is different for each stage.
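The baseline subtraction of equation (6), together with the partial cancellation just described, can be sketched in Python as follows. This is a minimal illustration only: the function name `cancel_interference` and the parameter `weight` (the multiplicative constant) are hypothetical, and the estimated user waveforms are assumed to be already available as rows of an array.

```python
import numpy as np

def cancel_interference(r, r_hat, weight=1.0):
    """Return one cleaned-up signal per user.

    r      : array of shape (T,), the received base-band signal
    r_hat  : array of shape (K, T), estimated waveform of each user
    weight : assumed cancellation constant; 1.0 gives the total
             subtraction of equation (6), values below 1.0 give
             partial interference cancellation
    """
    num_users = r_hat.shape[0]
    cleaned = np.empty_like(r_hat)
    for l in range(num_users):
        # Explicitly sum the estimated waveforms of all users except user l.
        others = [k for k in range(num_users) if k != l]
        cleaned[l] = r - weight * r_hat[others].sum(axis=0)
    return cleaned
```

Each user's cleaned-up signal here requires its own sum over the other users' waveforms.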

The above equations are implemented in the baseline long-code multiple user detection process 118 as illustrated in FIG. 3. The receiver base-band signal r[t] 122 is input to the rake receiver cards 112 (i.e., one rake receiver for each user) as described above. Each of the rake receivers 112 processes the base-band signal r[t] 122 and outputs the first approximation of the transmitted symbol, y_(k) ⁽⁰⁾[m] 304, for each user k (e.g., user 1 through user K), as well as the estimated amplitude {circumflex over (α)}_(kp) ⁽⁰⁾, time lag {circumflex over (τ)}_(kp) ⁽⁰⁾ and user code 306. For ease of notation, here, the superscript refers to the n^(th) regeneration iteration. Hence, for example, {circumflex over (α)}_(kp) ⁽⁰⁾ refers to the base-band estimate, since no iterations have been performed.

The y_(k) ⁽⁰⁾[m] 304 output from the rake receiver 112 is input into a detector which outputs hard or soft symbol estimates {circumflex over (b)}_(k) ⁽⁰⁾[m] used to cancel the effects of multiple access interference (MAI). One skilled in the art will appreciate that many different detectors may be used, including the hard-limiting (sign function) detector, the null-zone detector, the hyperbolic tangent detector and the linear-clipped detector, and that soft detectors (all but the first listed above) typically yield improved performance.
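The four detectors named above can be sketched in Python as follows. The text does not specify their parameters, so the `threshold` and `scale` arguments, like the function names, are assumptions made for illustration; the functional forms shown are the common textbook ones.

```python
import numpy as np

def hard_limiter(y):
    """Hard-limiting (sign function) detector."""
    return np.sign(y)

def null_zone(y, threshold):
    """Null-zone detector: zero inside |y| <= threshold, sign outside.
    The threshold is an assumed tuning parameter."""
    return np.where(np.abs(y) > threshold, np.sign(y), 0.0)

def tanh_detector(y, scale):
    """Hyperbolic tangent detector; scale is an assumed normalization."""
    return np.tanh(y / scale)

def linear_clipped(y, scale):
    """Linear-clipped detector: linear in y / scale, clipped to [-1, 1]."""
    return np.clip(y / scale, -1.0, 1.0)
```

The three soft detectors reduce the magnitude of the estimate for detection statistics near zero, so that unreliable symbols contribute less to the regenerated interference.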

The outputs from the rake receivers 112 and the soft symbol estimates are input into a respreading process 310 which assembles an estimated spread-spectrum waveform corresponding to the selected user but without pulse shaping. The re-spread signals are input into the raised-cosine filter 312 which produces an estimate of the received spread-spectrum waveform for the selected user.

The raised-cosine pulse shaping process accepts the signals from each of the respread processes (e.g., one for each user), and produces the estimated user waveforms {circumflex over (r)}_(k) ^((n))[t]. Next, the waveforms {circumflex over (r)}_(k) ^((n))[t] are further processed in a series of summation processes 314, 316, 318 to determine each user's cleaned-up signal {circumflex over (r)}_(l) ^((n+1))[t] according to the above equation (6).

Therefore, for example, to determine the signal corresponding to the 1^(st) user, the base-band signal r[t] 122 from the RF/IF receivers 110 containing information from all simultaneous users is reduced by the estimated signals {circumflex over (r)}_(k) ^((n))[t] for all users except the 1^(st) user. After the subtraction of the {circumflex over (r)}_(k) ^((n))[t] signals (e.g., {circumflex over (r)}₂ ^((n))[t] through {circumflex over (r)}_(K) _(v) ^((n))[t] as illustrated), the remainder signal contains predominantly the signal for the 1^(st) user. Hence, the summation function 314 applies the above equation (6) to produce the cleaned-up signal {circumflex over (r)}₁ ^((n+1))[t]. This process is performed for each simultaneous user.

The output from the summation processes 314, 316, 318 is supplied to the rake receivers 320 (or re-applied to the original rake receivers 112). The resulting signal produced by the rake receivers 320 is the refined matched-filter detection statistic y_(k) ⁽¹⁾[m]. The superscript (1) indicates that this is the first iteration on the base-band signal. Hence, the baseline long-code multiple user detection is implemented. As illustrated, only one iteration is performed; however, in other embodiments, multiple iterations may be performed depending on limitations (e.g., computational complexity, bandwidth, and other factors).

It can be appreciated by one skilled in the art that the above methods are limited by bandwidth and computational complexity. Specifically, for example, if K=128, i.e., there are 128 simultaneous users for this implementation, the total bisection bandwidth is 998.8 Gbytes/second, determined with the following assumptions, for example:

3.84 Mchips/sec/antenna/stream × 2 antennas × 8 samples/chip × 1 byte/sample × 128(128−1) streams = 998.8 Gbytes/sec

The computational complexity is calculated in terms of billion operations per second (GOPS), and is calculated separately for each of the processes of re-spreading, raised-cosine filtering, interference cancellation (IC), and the finger receiver operations. The re-spread process involves amplitude-chip-bit multiply-accumulate operations (macs). Assuming, for example, that there are only four possible chips and further that the amplitude-chip multiplications are performed via a table look-up requiring zero GOPS, then the re-spread computational complexity is the (amplitude-chip)×(bit macs). Therefore, the re-spread computational cost (in GOPS) is:

3.84 Mchips/sec/antenna/finger/virtual-user/multiple user detection stage × 2 antennas × 4 fingers × 256 virtual users × 1 multiple user detection stage × 4 ops/chip (real×complex mac) = 31.5 GOPS

Based on the same assumptions, the raised-cosine filter requires:

3.84 Mchips/sec/antenna/physical-user/multiple user detection stage × 8 samples/chip × 2 antennas × 128 physical users × 1 multiple user detection stage × 6 ops/sample/tap (complex additions then real×complex mac) × 24 taps (using symmetry) = 1,132.5 GOPS

The computational cost of the IC process is:

3.84 Mchips/sec/antenna/physical-user/multiple user detection stage × 8 samples/chip × 2 antennas × 128 physical users × 1 multiple user detection stage × 2 ops/sample/physical user (complex add) × 128 users = 2,013.3 GOPS

Finally, the computational complexity for the rake receiver processes is:

3.84 Mchips/sec/antenna/physical-user/multiple user detection stage × 2 antennas × 4 fingers × 256 virtual users × 1 multiple user detection stage × 8 ops/chip (complex mac) = 62.9 GOPS

Summing the separate computational complexities for each of the above processes yields the following results:

Process                    GOPS
Re-Spread                  31.5
Raised Cosine Filtering    1,132.5
IC                         2,013.3
Finger Receivers           62.9
TOTAL                      3,240.2
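The baseline figures above can be reproduced with a short Python calculation. This is only an arithmetic check of the stated assumptions (2 antennas, 8 samples per chip, 4 fingers, 128 physical users, 256 virtual users, one detection stage, 24 filter taps); the function name `baseline_costs` and the dictionary keys are hypothetical.

```python
CHIP_RATE = 3.84e6        # chips per second
ANTENNAS = 2
SAMPLES_PER_CHIP = 8
FINGERS = 4
PHYSICAL_USERS = 128
VIRTUAL_USERS = 256
STAGES = 1
TAPS = 24

def baseline_costs():
    """Return the baseline bandwidth (Gbytes/sec) and costs (GOPS)."""
    sample_rate = CHIP_RATE * SAMPLES_PER_CHIP * ANTENNAS
    streams = PHYSICAL_USERS * (PHYSICAL_USERS - 1)
    return {
        # 1 byte per sample on every user-to-user stream
        "bandwidth_gbytes": sample_rate * 1 * streams / 1e9,
        # 4 ops/chip (real x complex mac)
        "respread": CHIP_RATE * ANTENNAS * FINGERS * VIRTUAL_USERS * STAGES * 4 / 1e9,
        # 6 ops/sample/tap over 24 taps, one filter per physical user
        "raised_cosine": sample_rate * PHYSICAL_USERS * STAGES * 6 * TAPS / 1e9,
        # 2 ops/sample (complex add) per user, summed over 128 users
        "ic": sample_rate * PHYSICAL_USERS * STAGES * 2 * PHYSICAL_USERS / 1e9,
        # 8 ops/chip (complex mac)
        "rake": CHIP_RATE * ANTENNAS * FINGERS * VIRTUAL_USERS * STAGES * 8 / 1e9,
    }
```

Rounded to one decimal place, each entry matches the corresponding figure in the text. The tabulated total of 3,240.2 GOPS is the sum of the rounded entries; summing the unrounded values gives approximately 3,240.1 GOPS.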

However, both the bandwidth and computational complexity are reduced by employing a residual-signal implementation as now described. The bandwidth can be reduced by forming the residual signal, which is the difference between the received signal and the total (i.e., all users and all multi-paths) estimated signal. Then, the cleaned-up signal {circumflex over (r)}_(l) ^((n+1))[t] expressed in terms of the residual signal is:

$\begin{matrix}{\begin{matrix}{\begin{matrix}{\hat{r}_{l}^{(n+1)}[t] = r[t] - \sum\limits_{\underset{k \neq l}{k = 1}}^{K_{v}} \hat{r}_{k}^{(n)}[t]} \\ {= \hat{r}_{l}^{(n)}[t] + r[t] - \sum\limits_{k = 1}^{K_{v}} \hat{r}_{k}^{(n)}[t]} \\ {= \hat{r}_{l}^{(n)}[t] + r_{res}^{(n)}[t]}\end{matrix}} \\ {r_{res}^{(n)}[t] \equiv r[t] - \hat{r}^{(n)}[t]} \\ {\hat{r}^{(n)}[t] \equiv \sum\limits_{k = 1}^{K_{v}} \hat{r}_{k}^{(n)}[t]}\end{matrix}} & (7)\end{matrix}$

This implementation is illustrated in FIG. 4. One skilled in the art can recognize that, through the point of determining the output from the raised-cosine filters, the residual signal implementation is identical with that illustrated above within FIG. 3. It is at this point that the residual signal implementation differs, as now described.

A summation process 402 calculates r_(res) ^((n))[t] according to equation (7) above by accepting the base-band signal r[t] and subtracting the signal {circumflex over (r)}^((n))[t] (i.e., the sum of the outputs from all of the raised-cosine filters 312).

Differing from the baseline implementation, here, a first summation process 402 is performed by subtracting from the baseband signal r[t] 122 the output from each raised-cosine pulse shaping process 312. This produces the residual signal r_(res) ^((n))[t], corresponding to the difference between the base-band signal and the total (e.g., all users in all multi-paths) estimated signal.

The residual signal r_(res) ^((n))[t] is supplied to a further summation process for each user (e.g., 404) where the output from that user's raised-cosine pulse shaping process 312 is added to the r_(res)^((n))[t] signal as described in above equation (7), thus determining the cleaned-up signal {circumflex over (r)}_(l) ^((n+1))[t] for each user.
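The equivalence expressed by equation (7) can be checked numerically. The following Python sketch forms the residual once and adds each user's own estimate back, then compares the result with the direct subtraction of equation (6). The function names are hypothetical, and the random waveforms stand in for the outputs of the raised-cosine pulse shaping processes.

```python
import numpy as np

def residual_cancel(r, r_hat):
    """Cleaned-up signals via the residual form of equation (7).

    r     : array of shape (T,), the received base-band signal
    r_hat : array of shape (K, T), estimated waveform of each user
    """
    r_res = r - r_hat.sum(axis=0)        # one residual shared by all users
    return r_hat + r_res[None, :]        # add each user's own estimate back

def direct_cancel(r, r_hat):
    """Cleaned-up signals via the direct form of equation (6)."""
    num_users = r_hat.shape[0]
    return np.stack([
        r - sum(r_hat[k] for k in range(num_users) if k != l)
        for l in range(num_users)
    ])
```

The residual form needs one shared subtraction plus one addition per user, whereas the direct form sums over all other users for each user.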

Next, as with the baseline implementation, the cleaned-up signal {circumflex over (r)}_(l) ^((n+1))[t] for each user is supplied to a rake receiver 320 (or reapplied to 112) for processing into the resultant y_(l) ^((n+1))[m] detection statistics ready for processing by the symbol processor 120.

One skilled in the art can recognize that both the bandwidth and computational complexity are improved (i.e., lowered) for this implementation compared with the baseline implementation described above. Specifically, continuing with the assumptions used in determining the bandwidth and computational complexity as above and applying those assumptions to the residual-signal implementation, the bandwidth can be estimated as follows:

3.84 Mchips/sec/antenna/stream × 2 antennas × 8 samples/chip × 1 byte/sample × 129 streams = 7.9 Gbytes/sec

The computational complexity for each of the processes is as follows: the re-spreading and raised-cosine costs are the same as with the baseline implementation.

For the IC processes, the computational complexity is:

3.84 Mchips/sec/antenna/physical-user/multiple user detection stage × 8 samples/chip × 2 antennas × 128 physical users × 1 multiple user detection stage × 2 ops/sample/waveform addition (complex add) × 3 waveform additions = 47.2 GOPS

Finally, the finger receiver processes are the same as with the baseline implementation above. Therefore, summing the separate computational complexities for each of the above processes yields the following results:

Process                    GOPS
Re-Spread                  31.5
Raised Cosine Filtering    1,132.5
IC                         47.2
Finger Receivers           62.9
TOTAL                      1,274.1

Therefore, both the bandwidth and computational complexity are improved. However, it can be recognized by one skilled in the art that even with such improvement, the computational complexity may be a limiting factor.

Further improvement is possible and is now described in the following three embodiments, although other embodiments can be recognized by one skilled in the art. One improvement is to utilize an implicit waveform subtraction rather than the explicit waveform subtraction described for use with both the baseline implementation and the residual long-code implementation above. A considerable reduction in computational complexity results if the individual user waveforms are not explicitly calculated, but rather implicitly calculated.

The illustrated embodiment utilizes implicit waveform subtraction by expanding on equation (7) above, and using approximations as shown below in equation (8).

$\begin{matrix}{\begin{aligned}
y_{l}^{(n+1)}[m] &\equiv \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} \hat{r}_{l}^{(n+1)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] c_{lm}^{*}[n] \right\} \\
&= \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} \hat{r}_{l}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] c_{lm}^{*}[n] \right\} + \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} r_{res}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] c_{lm}^{*}[n] \right\} \\
&= \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} \left[ \sum\limits_{m'} \sum\limits_{q'=1}^{L} \hat{a}_{lq'}^{(n)} s_{lm'}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} - \hat{\tau}_{lq'}^{(n)} + (m - m')T_{l} \right] \hat{b}_{l}^{(n)}[m'] \right] c_{lm}^{*}[n] \right\} + y_{res,l}^{(n)}[m] \\
&\cong \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} \left[ \sum\limits_{q'=1}^{L} \hat{a}_{lq'}^{(n)} s_{lm}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} - \hat{\tau}_{lq'}^{(n)} \right] \right] c_{lm}^{*}[n] \right\} \cdot \hat{b}_{l}^{(n)}[m] + y_{res,l}^{(n)}[m] \\
&= \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \sum\limits_{q'=1}^{L} \hat{a}_{lq}^{(n)H} \hat{a}_{lq'}^{(n)} \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} s_{lm}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} - \hat{\tau}_{lq'}^{(n)} \right] c_{lm}^{*}[n] \right\} \hat{b}_{l}^{(n)}[m] + y_{res,l}^{(n)}[m] \\
&\cong \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \hat{a}_{lq}^{(n)} \right\} \cdot \hat{b}_{l}^{(n)}[m] + y_{res,l}^{(n)}[m] \\
&= A_{l}^{(n)2} \cdot \hat{b}_{l}^{(n)}[m] + y_{res,l}^{(n)}[m]
\end{aligned}} & (8)\end{matrix}$
$A_{l}^{(n)2} \equiv \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \hat{a}_{lq}^{(n)} \right\}$
$y_{res,l}^{(n)}[m] \equiv \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \cdot \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} r_{res}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] \cdot c_{lm}^{*}[n] \right\}$

The two approximations used, as indicated within equation (8), include neglecting inter-symbol interference terms for the user of interest, and further, neglecting cross-multi-path interference terms for the user of interest. Because the user of interest term has a strong deterministic term, the omission of these low-level random contributions is justified. These contributions could be included in a more detailed embodiment without incurring excessive increases in computational complexity. However, implementation computational complexity would increase somewhat. Such an embodiment may be appropriate for high data-rate, low spreading factor users where inter-symbol and cross-multi-path terms are larger.

A noteworthy aspect of equation (8) above is that the rake receiver operation on the estimated user of interest signal {circumflex over (r)}_(l) ^((n))[t] can be calculated analytically. Thus, the signal need not be explicitly formed, but rather, the corresponding contribution is added after the rake receiver operation on the residual signal alone. Now referring to FIG. 5, this implicit waveform subtraction implementation is illustrated.
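The last line of equation (8) reduces the per-user update to a scalar operation on each symbol. A minimal Python sketch follows, assuming a single antenna (so that the Hermitian transpose is a complex conjugate) and assuming the residual rake outputs are already available; the function name `implicit_update` and its argument names are hypothetical.

```python
import numpy as np

def implicit_update(amp_hat, b_hat, y_res):
    """Refined statistics per the last line of equation (8).

    amp_hat : complex array of the user's estimated finger amplitudes
    b_hat   : array of the user's symbol estimates, one per symbol period
    y_res   : array of rake outputs computed on the residual signal
    """
    # A_l^2 is the real part of the summed squared amplitude magnitudes.
    a_sq = np.real(np.sum(np.conj(amp_hat) * amp_hat))
    return a_sq * b_hat + y_res
```

No per-user waveform is formed here: the contribution of the user of interest enters only through the amplitude term and the symbol estimate.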

One skilled in the art can glean from the illustration that separate re-spreading and raised-cosine processing is no longer performed on each individual user signal, but rather, is performed only once on the baseband composite re-spread signal ρ^((n))[t]. Thus, the re-spread process 312 accumulates the composite signal ρ^((n))[t] based on the amplitudes {circumflex over (α)}_(kp) ^((n)), time lags {circumflex over (τ)}_(kp) ^((n)) and user codes. The output from the re-spreading process is then filtered to produce another composite signal {circumflex over (r)}^((n))[t] 502 as described below and in equation (9).

At this point, it is of note that a substantial reduction in computational complexity accrues due to not having to explicitly calculate the individual user estimated waveforms. As illustrated in FIG. 5, the individual user waveforms are not required; hence, the composite signal ρ^((n))[t] 502 representing the sum of all estimated user waveforms can be formed by calculating this composite waveform first without performing the raised-cosine filtering process on each individual waveform. Only one filtering operation need be performed, which represents a substantial reduction in computational complexity.

The form of ρ^((n))[t] is as follows:

$\begin{matrix}{{{\rho^{(n)}\lbrack t\rbrack} = {\sum\limits_{k = 1}^{K_{v}}{\sum\limits_{p = 1}^{L}{\sum\limits_{r}{{\delta\left\lbrack {t - {\hat{\tau}}_{kp}^{(n)} - {rN}_{c}} \right\rbrack} \cdot {\hat{a}}_{kp}^{(n)} \cdot {c_{k}\lbrack r\rbrack} \cdot {{\hat{b}}_{k}^{(n)}\left\lbrack \left\lfloor {r/N_{k}} \right\rfloor \right\rbrack}}}}}}{{{\hat{r}}^{(n)}\lbrack t\rbrack} = {\sum\limits_{r}{{g\lbrack r\rbrack}{\rho^{(n)}\left\lbrack {t - r} \right\rbrack}}}}} & (9)\end{matrix}$
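Equation (9) can be sketched in Python as follows. The sketch accumulates all users and fingers into one impulse train and then applies a single filter. It assumes integer sample lags, a single antenna, and a causal filter array `g` indexed from zero; the function names, argument names and the short filter used in any example are hypothetical and do not represent an actual raised-cosine design.

```python
import numpy as np

def composite_respread(amp_hat, tau_hat, codes, b_hat, spreading, n_c, length):
    """Accumulate rho[t] of equation (9) over all users and fingers.

    amp_hat[k], tau_hat[k] : finger amplitudes and integer lags of user k
    codes[k]               : chip sequence c_k[r] of user k
    b_hat[k]               : symbol estimates of user k
    spreading[k]           : spreading factor N_k of user k
    n_c                    : samples per chip, N_c
    length                 : number of output samples (assumed large enough)
    """
    rho = np.zeros(length, dtype=complex)
    for k in range(len(codes)):
        chips = np.asarray(codes[k])
        r_idx = np.arange(len(chips))
        # Chip r belongs to symbol floor(r / N_k).
        symbols = np.asarray(b_hat[k])[r_idx // spreading[k]]
        for a, tau in zip(amp_hat[k], tau_hat[k]):
            rho[tau + r_idx * n_c] += a * chips * symbols
    return rho

def pulse_shape(rho, g):
    """Apply the single composite filter: r_hat[t] = sum_r g[r] rho[t - r]."""
    return np.convolve(rho, g)[:len(rho)]
```

Because the filter is applied once to the composite signal, its cost does not scale with the number of users.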

With the form of the composite waveform ρ^((n))[t] established, and referring back to FIG. 5, this waveform is transformed into {circumflex over (r)}^((n))[t] via the raised-cosine pulse shaping filter 312. From here, a summation process 506 subtracts {circumflex over (r)}^((n))[t] from the base-band waveform r[t], producing the residual waveform r_(res) ^((n))[t] as shown above (e.g., in equation (7)).

Unlike the residual signal implementation described above, here, the r_(res) ^((n))[t] is applied directly to the rake receivers 506 (or reapplied to the rake receivers 112) for each user together with the user code for that user. The output from each rake receiver is applied to a summation process, where the A_(l) ^((n)2)·{circumflex over (b)}_(l) ^((n))[m] values are added to the rake receiver output as described above in equation (8), producing the y_(l) ^((n+1))[m] detection statistics suitable for symbol processing 120.

The computational complexity of this embodiment is reduced as now described. The re-spread processing and rake receiver computational costs are the same as with the previous implementations. However, the raised-cosine filtering and interference cancellation computational costs are now as follows.

For the raised-cosine filtering,

3.84 Mchips/sec/antenna/multiple user detection stage × 8 samples/chip × 2 antennas × 1 multiple user detection stage × 6 ops/sample (complex addition then real×complex mac) × 24 taps (using symmetry) = 8.8 GOPS

The computational cost of the IC process is

3.84 Mchips/sec/antenna/multiple user detection stage × 8 samples/chip × 2 antennas × 1 multiple user detection stage × 2 ops/sample/waveform addition (complex add) × 1 waveform addition = 0.123 GOPS

Summing the separate computational complexities for each of the above processes yields the following results:

Process                    GOPS
Re-Spread                  31.5
Raised Cosine Filtering    8.8
IC                         0.1
Finger Receivers           62.9
TOTAL                      103.3

Another embodiment using matched-filter outputs rather than antenna streams is now presented. This embodiment follows from equation (8) above where the rake receiver outputs are:

$\begin{matrix}{y_{res,l}^{(n)}[m] \equiv \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \cdot \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} r_{res}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] \cdot c_{lm}^{*}[n] \right\}} & (10)\end{matrix}$
and further, using equation (7) above, equation (10) can be re-written as:

$\begin{matrix}{\begin{aligned}
y_{res,l}^{(n)}[m] &\equiv \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \cdot \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} r_{res}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] \cdot c_{lm}^{*}[n] \right\} \\
&= \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \cdot \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} r\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] \cdot c_{lm}^{*}[n] \right\} - \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \cdot \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} \hat{r}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] \cdot c_{lm}^{*}[n] \right\} \\
&= y_{l}^{(n)}[m] - \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \cdot \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} \hat{r}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] \cdot c_{lm}^{*}[n] \right\}
\end{aligned}} & (11)\end{matrix}$
and then, combining equation (11) with equation (8) yields:

$\begin{matrix}{y_{l}^{(n+1)}[m] = A_{l}^{(n)2} \cdot \hat{b}_{l}^{(n)}[m] + y_{l}^{(n)}[m] - \mathrm{Re}\left\{ \sum\limits_{q=1}^{L} \hat{a}_{lq}^{(n)H} \cdot \frac{1}{2N_{l}} \sum\limits_{n=0}^{N_{l}-1} \hat{r}^{(n)}\left[ nN_{c} + \hat{\tau}_{lq}^{(n)} + mT_{l} \right] \cdot c_{lm}^{*}[n] \right\}} & (12)\end{matrix}$

This embodiment improves on the above approaches in that the antenna streams do not need to be input into the multiple user detection process; however, it is not possible to re-estimate the channel amplitudes.

FIG. 6 illustrates the matched-filter output embodiment. As illustrated, the processing of the baseband r[t] waveform is accomplished as described in FIG. 5 above, and further, ρ^((n))[t] is determined in accordance with equation (9) and is applied to the raised-cosine pulse shaping process 602.

Differing from the above embodiment, however, there is no summation process before applying {circumflex over (r)}^((n))[t] to the second rake receiver process 604. Rather, {circumflex over (r)}^((n))[t] is applied directly to the rake receiver process 604. The output from the rake receivers 604 is subtracted 606 from the output y_(l) ^((n))[m] from the first rake receivers 112. This difference is then added to the A_(l) ^((n)2)·{circumflex over (b)}_(l) ^((n))[m] value to produce y_(l)^((n+1))[m]. This process is described within the above equations (11) and (12).
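The steps just described can be sketched in Python as follows, together with a check that equation (12) agrees with the residual form of equation (8). This rests on the linearity of the rake operation. The sketch assumes a single antenna, a single symbol period (m = 0) and integer lags; the function names `rake` and `refined_statistic` are hypothetical.

```python
import numpy as np

def rake(signal, code, amp_hat, tau_hat, n_c):
    """Single-symbol, single-antenna rake output in the form of equation (4)."""
    n_l = len(code)
    idx = np.arange(n_l) * n_c
    total = 0.0
    for a, tau in zip(amp_hat, tau_hat):
        despread = np.sum(signal[idx + tau] * np.conj(code)) / (2 * n_l)
        total += np.real(np.conj(a) * despread)
    return total

def refined_statistic(y_prev, r_hat_total, code, amp_hat, tau_hat, b_hat, n_c):
    """Equation (12): uses the earlier rake output y_prev in place of r[t].

    r_hat_total is the composite estimated waveform after pulse shaping.
    """
    a_sq = np.real(np.sum(np.conj(amp_hat) * amp_hat))
    y_est = rake(r_hat_total, code, amp_hat, tau_hat, n_c)
    return a_sq * b_hat + y_prev - y_est
```

Only the earlier matched-filter output and the composite estimate are needed, which is why the antenna streams need not be routed into the detection process in this embodiment.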

The computational complexity is reduced because there is no longer an explicit interference canceling (IC) operation, and thus, the interference canceling computational cost is zero. The rake receiver computational cost is half the previous embodiment's value because the amplitudes can no longer be re-estimated, and there is no need to cancel interference on the dedicated physical control channel (DPCCH). Therefore, the computational cost is:

Process                     GOPS
Re-Spread                   31.5
Raised Cosine Filtering      8.8
IC                           0.0
Finger Receivers            31.5
TOTAL                       71.8

Another embodiment using matched-filter outputs obtained before the maximal ratio combination (MRC) is now described. The pre-MRC rake matched-filter outputs can be described as:

$\begin{matrix}{{y_{lq}^{(0)}\lbrack m\rbrack} = {\frac{1}{2N_{l}}{\sum\limits_{n = 0}^{N_{l} - 1}{{r\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}}}}} & (13)\end{matrix}$

The same detection statistics based on the cleaned-up signal {circumflex over (r)}_(l) ^((n+1))[t] are

$\begin{matrix}{{y_{lq}^{({n + 1})}\lbrack m\rbrack} = {\frac{1}{2N_{l}}{\sum\limits_{n = 0}^{N_{l} - 1}{{{\hat{r}}_{l}^{({n + 1})}\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}}}}} & (14)\end{matrix}$

Now from Equation (7),

$\begin{matrix}{{{\hat{r}}_{l}^{({n + 1})}\lbrack t\rbrack} = {{{\hat{r}}_{l}^{(n)}\lbrack t\rbrack} + {r\lbrack t\rbrack} - {{\hat{r}}^{(n)}\lbrack t\rbrack}}} & (15)\end{matrix}$

Hence the first-stage pre-MRC matched-filter outputs can be re-written:

$\begin{matrix}\begin{aligned}{y_{lq}^{({n + 1})}\lbrack m\rbrack} &= \frac{1}{2N_{l}}\sum\limits_{n = 0}^{N_{l} - 1}{{{\hat{r}}_{l}^{({n + 1})}\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}} \\ &= \frac{1}{2N_{l}}\sum\limits_{n = 0}^{N_{l} - 1}{{{\hat{r}}_{l}^{(n)}\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}} + \frac{1}{2N_{l}}\sum\limits_{n = 0}^{N_{l} - 1}{{r\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}} - \frac{1}{2N_{l}}\sum\limits_{n = 0}^{N_{l} - 1}{{{\hat{r}}^{(n)}\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}} \\ &= {{\hat{a}}_{lq}^{(n)} \cdot {{\hat{b}}_{l}^{(n)}\lbrack m\rbrack}} + {y_{lq}^{(n)}\lbrack m\rbrack} - {y_{{est},{lq}}^{(n)}\lbrack m\rbrack} \\ {y_{{est},{lq}}^{(n)}\lbrack m\rbrack} &\equiv \frac{1}{2N_{l}}\sum\limits_{n = 0}^{N_{l} - 1}{{{\hat{r}}^{(n)}\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}}\end{aligned} & (16)\end{matrix}$

where the following approximation has been used,

$\begin{matrix}{{y_{{lq},1}^{({n + 1})}\lbrack m\rbrack} \equiv {\frac{1}{2N_{l}}{\sum\limits_{n = 0}^{N_{l} - 1}{{{\hat{r}}_{l}^{(n)}\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq}^{(0)} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}}}} \cong {{\hat{a}}_{lq}^{(n)} \cdot {{\hat{b}}_{l}^{(n)}\lbrack m\rbrack}}} & (17)\end{matrix}$

Given the pre-MRC matched-filter outputs, the re-estimated channel amplitudes are

$\begin{matrix}{{\hat{a}}_{lq}^{({n + 1})} \equiv {\sum\limits_{s}{{{w\lbrack s\rbrack} \cdot \frac{1}{N_{p}}}{\sum\limits_{m = 0}^{N_{p} - 1}{{y_{lq}^{({n + 1})}\left\lbrack {m + {Ms}} \right\rbrack} \cdot {{\hat{b}}_{l}^{(n)}\left\lbrack {m + {Ms}} \right\rbrack}}}}}} & (18)\end{matrix}$

wherein

-   w[s] is a filter,
-   N_(p) is a number of symbols, and
-   M is a number of symbols per slot,

and the post-MRC matched-filter outputs are then

$\begin{matrix}{{y_{l}^{({n + 1})}\lbrack m\rbrack} \equiv {{Re}\left\{ {\sum\limits_{q = 1}^{L}{{\hat{a}}_{lq}^{{({n + 1})}^{H}} \cdot {y_{lq}^{({n + 1})}\lbrack m\rbrack}}} \right\}}} & (19)\end{matrix}$
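Equations (18) and (19) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the slot index s is taken to run over the entries of a short weight list starting at slot 0, and the data is noise-free toy data.

```python
def reestimate_amplitude(y_lq, b_hat, weights, n_p, symbols_per_slot):
    """Equation (18): filter the per-slot averages of y_lq[m] * b_hat[m]."""
    a_new = 0j
    for s, w_s in enumerate(weights):          # assumed range of the slot sum
        base = symbols_per_slot * s            # M * s
        slot_avg = sum(y_lq[base + m] * b_hat[base + m] for m in range(n_p)) / n_p
        a_new += w_s * slot_avg
    return a_new


def mrc(amps, y_fingers):
    """Equation (19): Re{ sum_q conj(a_q) * y_q } for one symbol."""
    return sum((a.conjugate() * y).real for a, y in zip(amps, y_fingers))


# Noise-free toy case: one finger, two slots of ten symbols each.
a_true = 0.6 - 0.8j
b_hat = [1, -1, 1, 1, -1] * 4
y_lq = [a_true * b for b in b_hat]

a_est = reestimate_amplitude(y_lq, b_hat, weights=[0.5, 0.5],
                             n_p=10, symbols_per_slot=10)
y_post = mrc([a_est], [y_lq[0]])
```

With weights that sum to one and error-free bit estimates, `a_est` recovers `a_true`, and the post-MRC statistic for the first symbol is the finger energy times the bit.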

This embodiment is illustrated in FIG. 7. Here, the y_(lq) ⁽⁰⁾[m] detection statistics are produced as with the above embodiments; however, before they are applied to the MRC, the estimated amplitude {circumflex over (α)}_(lq) ⁽⁰⁾ is determined. Next, the MRC produces the y_(l) ⁽⁰⁾[m] detection statistics, which are formed from the amplitudes {circumflex over (α)}_(lq) ⁽⁰⁾ and the pre-combination matched-filter detection statistics y_(lq) ⁽⁰⁾[m] as in Equation (19) above.

The {circumflex over (r)}[t] waveform is applied (or reapplied) to a rake receiver 704. The output from the rake receiver 704 is subtracted 706 from the y_(lq) ⁽⁰⁾[m] detection statistics. Next, the difference from the subtraction 706 is summed 708 with the {circumflex over (α)}_(lq) ⁽⁰⁾·{circumflex over (b)}_(l) ⁽⁰⁾[m] value, thus producing y_(lq) ⁽¹⁾[m] in accordance with equation (16) above.

After n iterations are performed, the y_(lq) ^((n)) detection statistics for each of the users corresponding to each antenna have been determined. The detection statistics for each user, y_(l) ^((n)), are next determined by estimating the complex amplitudes 710 across the Q channels for that user and performing a maximum ratio combination 712 using those amplitudes.

Although the computational complexity is increased here, it is possible to re-estimate channel amplitudes and, hence, to cancel interference on the dedicated physical control channels (DPCCH). The computational complexity of this embodiment is:

Process                     GOPS
Re-Spread                   31.5
Raised Cosine Filtering      8.8
IC                           0.0
Finger Receivers            62.9
TOTAL                      103.2

which is still within a practical range.

Therefore, the embodiments above, and other non-illustrated embodiments, provide methods for performing multiple user detection.

Turning now to software implementations for the above, one of several implementations is designed to allow full or partial decoding of users at various transmission time intervals (TTIs) within the multiple user detection (MUD) iterative loop. The approach, illustrated in FIG. 8, allows users belonging to different classes (e.g., voice and data) to be processed with different latencies. For example, voice users could be processed with a 10+ms latency 802, whereas data users could be processed with an 80+ms latency 804. Alternately, voice users could be processed with a 20+ms latency 806 or a 40+ms latency 808, so as to include voice decoding in the MUD loop. Other alternatives are possible depending on the implementation and limitations of the processing requirements.

If a particular data user is to be processed with an 80+ms latency 804 so as to include the full turbo decode within the MUD loop, then the input channel bit-error rate (BER) pertaining to these users might be extraordinarily high. Here, the MUD processing might be configured so as to not include any cancellation of the data users within the 10+ms latency 802. These data users would then be cancelled in the 20+ms latency 806 period. For this cancellation it could be opted to perform MUD only on data users. The advantage of canceling the voice users in the first latency range (e.g., first box) would still benefit the second latency range processing.

Alternately, the second box 806 could perform cancellation on both voice and data users. The reduced voice channel bit-error rate would not benefit the voice users, whose data has already been shipped out to meet the latency requirement, but the reduced voice channel BER would improve the cancellation of voice interference from the data users. In the case that voice and data users are cancelled in the second box 806, another possible configuration would be to arrange the boxes in parallel. Other reduced-latency configurations with mixed serial and parallel arrangements of the processing boxes are also possible.

Depending on the arrangement chosen, the performance for each class of user will vary. The approach above tends to balance the propagation range for data and voice users, and the particular arrangement can be chosen to tailor range for the various voice and data services.

Each box is the same code but configured differently. The parameters that differ are:

-   N_FRAMES_RAKE_OUTPUT;
-   Decoding to be performed (e.g., repetition decoding, turbo decoding, and the like);
-   Classes of users to be cancelled;
-   Threshold parameters.

The pseudo code for the software implementation of one long-code multiple user detection processing box is as follows:

Initialize
    Zero data
    Generate OVSF codes
    Generate raised cosine pulse
    Allocate memory
    Open rake output files
    Open mod output files
    Align mod data
Main Frame Loop {
    Determine number of physical users
    Read_in_rake_output_records (N frames)
    Reformat_rake_output_data (N frames at a time)
    for stage = 1:N_stages
        Perform appropriate decoding (SRD, turbo, and the like, depending on TTI)
        Perform_long_code_mud
    end
}
Free memory

The following four functions are described below:

-   Read_in_rake_output_records;
-   Reformat_rake_output_data;
-   Perform appropriate decoding (SRD, turbo, and the like, depending on TTI);
-   Perform_long_code_mud.

The Read_in_rake_output_records function performs:

-   Reading in data for each user; and
-   Assigning data structure pointers.

The rake data transferred to MUD is associated with structures of type Rake_output_data_type. The elements of this structure are given in Table 1. There is a parameter N_FRAMES_RAKE_OUTPUT with values {1, 2, 4, 8} that specifies the number of frames to be read in at a time. The following table lists the elements of the structure Rake_output_buf_type:

Element Type       Name
unsigned long      Frame_number
unsigned long      physical_user_code_number
int                physical_user_tfci
int                physical_user_sf
int                physical_user_beta_c
int                physical_user_beta_d
int                N_dpdchs
int                compressed_mode_flag
int                compressed_mode_frame
int                N_first
int                TGL
int                slot_format
int                N_rake_fingers
int                N_antennas
unsigned long      mpath_offset[N_ANTENNAS]
unsigned long      tau_offset
unsigned long      y_offset
COMPLEX*           mpath[N_ANTENNAS]
unsigned long*     tau_hat
float*             y_data

It is helpful to describe several structure elements for a complete understanding. The element slot_format is an integer from 0 to 11 representing the row in the following table (DPCCH fields), 3GPP TS 25.211. By way of non-limiting example, when slot_format=3, it maps to the fourth row in the table, corresponding to slot format 1 with 8 pilot bits and 2 TPC bits. The offset values (e.g., tau_offset) give the location in memory, relative to the top of the structure, where the corresponding data is stored. These offset values are used for setting the corresponding pointers (e.g., tau_hat). For example, if Rbuf is a pointer to the structure then:

Rbuf->tau_hat = (unsigned long*)((unsigned long)Rbuf + Rbuf->tau_offset);

is used to set the tau_hat pointer.

The rake output structure associated data (mpath, tau_hat and y_data) is ordered as follows:

mpath[n][q+s*L] = amplitude data
tau_hat[q] = delay data
y_data[0+m*M] = DPCCH data for symbol period m
y_data[1+j+(d−1)*J+m*M] = dth DPDCH data for symbol period m

where

-   n=antenna index (0: Na−1)
-   q=finger index (0: L−1)
-   s=slot index (0: Nslots−1)
-   m=symbol index (0: 149)
-   j=bit index (0: J−1)
-   d=DPDCH index (1: Ndpdchs)
-   Na=N_ANTENNAS
-   L=N_RAKE_FINGERS_MAX
-   Nslots=N_SLOTS_PER_FRAME=15
-   J=256/SF
-   M=1+J*Ndpdchs.
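The y_data ordering above can be expressed as two small index functions. The following Python sketch uses hypothetical function names and assumes that 256/SF is an exact integer division; it only restates the indexing rule given above.

```python
def y_data_layout(sf, n_dpdchs):
    """Return (J, M): bits per DPDCH per symbol period and the per-symbol stride."""
    bits_per_dpdch = 256 // sf                   # J = 256/SF
    stride = 1 + bits_per_dpdch * n_dpdchs       # M = 1 + J*Ndpdchs
    return bits_per_dpdch, stride


def dpcch_index(m, stride):
    """Index of the DPCCH value for symbol period m."""
    return 0 + m * stride


def dpdch_index(j, d, m, bits_per_dpdch, stride):
    """Index of bit j of the d-th DPDCH (d counts from 1) for symbol period m."""
    return 1 + j + (d - 1) * bits_per_dpdch + m * stride


# Example: one DPDCH at SF 64 gives J = 4 and M = 5.
J, M = y_data_layout(sf=64, n_dpdchs=1)
last = dpdch_index(j=J - 1, d=1, m=149, bits_per_dpdch=J, stride=M)
```

For this example a frame of y_data holds 150*M = 750 values, and `last` is the final index, 749.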

The memory required for the rake output buffers is dominated by the y-data memory requirement. The maximum memory requirement for a user is Nsym*(1+64*6) floats per frame, where Nsym=150 is the number of symbols per frame. This corresponds to 1 DPCCH at SF 256 and 6 DPDCHs at SF 4. If all 128 users were allocated this much memory, memory problems could arise. To minimize allocation problems, the following table gives the maximum number of users that the MUD implementation will be designed to handle at a given SF and Ndpdchs.

SF     Ndpdchs   Number users   Bits per symbol   Mean bits per symbol
256    1         256            2                 4.0
128    1         192            3                 4.5
64     1         128            5                 5.0
32     1         96             9                 6.8
16     1         64             17                8.5
8      1         32             33                8.3
4      1         16             65                8.1
4      2         12             129               12.1
4      3         8              193               12.1
4      4         4              257               8.0
4      5         3              321               7.5
4      6         2              385               6.0

In the preceding table, Bits per symbol=1+(256/SF)*N_DPDCHs, Mean bits per symbol=(Number users)*(Bits per symbol)/128, and Ndpdchs=Number of DPDCHs.
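The two formulas for the table can be checked with a short Python sketch. The function names and the `n_users_max` parameter are hypothetical; the constant 128 is the divisor stated above.

```python
def bits_per_symbol(sf, n_dpdchs):
    """One DPCCH bit plus 256/SF bits for each DPDCH."""
    return 1 + (256 // sf) * n_dpdchs


def mean_bits_per_symbol(n_users, sf, n_dpdchs, n_users_max=128):
    """Load of n_users such users, averaged over n_users_max user slots."""
    return n_users * bits_per_symbol(sf, n_dpdchs) / n_users_max


# A few rows of the table: (SF, Ndpdchs, Number users).
rows = [(256, 1, 256), (128, 1, 192), (4, 6, 2)]
results = [(bits_per_symbol(sf, nd), mean_bits_per_symbol(nu, sf, nd))
           for sf, nd, nu in rows]
```

The unrounded means for these three rows are 4.0, 4.5 and 6.015625, which agree with the tabulated values after rounding to one decimal place.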

From the above table, the parameter specifying the mean number of bits per symbol is set to MEAN_BITS_PER_SYMBOL=16, which exceeds the largest tabulated mean of 12.1. The code checks whether the physical-user specifications are consistent with this memory allocation. Given this specification, the following are estimates for the memory required for the rake output buffers.

Data              Type        Size (bytes)   Count (expression)           Count    Bytes
Rake_output_buf   Structure   88             1                            1        88
mpath             COMPLEX     8              Lmax * Nslots * Na           240      1,920
tau               int         4              Lmax                         8        32
y                 float       4              Nsym * Nbits                 2400     9,600
y_(lq)            COMPLEX     8              Nsym * Nbits * Lmax * Na     38,400   307,200
Total bytes per user per frame                                                     318,840
Total bytes for 128 users and 9 frames                                             367 Mbytes

where Count is per physical user per frame, assuming numeric values based on:

Lmax = N_RAKE_FINGERS_MAX = 8
Nslots = N_SLOTS_PER_FRAME = 15
Na = N_ANTENNAS = 2
Nsym = N_SYMBOLS_PER_FRAME = 150
Nbits = MEAN_BITS_PER_SYMBOL = 16
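The memory estimate follows directly from these constants. The Python sketch below recomputes it; the dictionary layout and variable names are illustrative assumptions, while the element sizes and counts are those tabulated above.

```python
L_MAX = 8       # N_RAKE_FINGERS_MAX
N_SLOTS = 15    # N_SLOTS_PER_FRAME
N_A = 2         # N_ANTENNAS
N_SYM = 150     # N_SYMBOLS_PER_FRAME
N_BITS = 16     # MEAN_BITS_PER_SYMBOL

# name -> (element size in bytes, element count per user per frame)
BUFFERS = {
    "Rake_output_buf": (88, 1),
    "mpath": (8, L_MAX * N_SLOTS * N_A),
    "tau": (4, L_MAX),
    "y": (4, N_SYM * N_BITS),
    "y_lq": (8, N_SYM * N_BITS * L_MAX * N_A),
}

bytes_per_user_frame = sum(size * count for size, count in BUFFERS.values())
total_bytes = bytes_per_user_frame * 128 * 9   # 128 users, 9 frames
```

This gives 318,840 bytes per user per frame and 367,303,680 bytes in total, that is, about 367 Mbytes.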

The location of each structure is stored in an array of pointers

Rake_output_buf[User+Frame_idx*N_USERS_MAX]

where Frame_idx varies from 0 to N_FRAMES_RAKE_OUTPUT inclusive. Frame 0 is initially set with zero data. After all frames are processed, the structure and data corresponding to the last frame are copied back to frame 0, and N_FRAMES_RAKE_OUTPUT new structures and data are read from the input source.

The Reformat_rake_output_data function performs:

-   Combines multi-path data across frame boundaries
-   Determines the number of rake fingers for each MUD processing frame
-   Fills virtual-user data structures
-   Separates DPCHs into virtual users
-   Determines chip and sub-chip delays for all fingers
-   Determines the minimum SF and maximum number of DPDCHs for each user
-   Reformats user b-data to correspond to the minimum SF
-   Reformats rake data to be linear and contiguous in memory.

Interference cancellation is performed over MUD processing frames. Due to multi-path and asynchronous users, the MUD processing frame will not correspond exactly with the user frames. MUD processing frames, however, are defined so as to correspond as closely as possible to user frames. It is preferable for MUD processing that the number of multi-path returns be constant across MUD processing frames. The function of multi-path combining is to format the multi-path data so that it appears constant to the long-code MUD processing function. Each time N=N_FRAMES_RAKE_OUTPUT frames of data are read from the input source, the combining function is called.

FIG. 9 shows a hypothetical set of multi-path lags corresponding to several frames of user data 902. Also shown are the corresponding MUD processing frames 904. Notice that MUD processing frame k overlaps with user frames k−1 and k. For example, processing frame 1 906 overlaps with user frame 0 908, and further, overlaps with user frame 1 910. The MUD processing frame is positioned so that this is true for all multi-paths of all users. A one-symbol period corresponds to a round trip for a 10 km radius cell. Hence even large cells are typically only a few symbols asynchronous.

The multi-path combining function determines all distinct delay lags from user frames k−1 and k. Each of these lags is assigned as a distinct multi-path associated with MUD processing frame k, even if some of the distinct lags are obviously the same finger displaced in delay due to channel dynamics. The amplitude data for a finger that extends into a frame where the finger was not present is set to zero. The illustrated thin lag-lines (e.g., 912) represent finger amplitude data that is set to zero. After the tentative number of fingers is assessed in this way, the total finger energy that falls within the MUD processing frame is assessed for each tentative finger, and the top N_RAKE_FINGERS_MAX fingers are assigned. In the assignment of fingers, the finger indices for fingers that were active in the previous MUD processing frame are kept the same so as not to drop data.

The user SF and number of DPDCHs can change every frame. It is helpful for efficient MUD processing that the user SF and number of DPDCHs be constant across MUD processing frames. This function, Reformat_rake_output_data, formats the user b-data so that it appears constant to the long-code MUD processing function. Each time N=N_FRAMES_RAKE_OUTPUT frames of data are read from the input source, this function is called. The function scans the N frames of rake output data and determines for each user the minimum SF and maximum number of DPDCHs. Virtual users are assigned according to the maximum number of DPCHs. If for a given frame the user has fewer DPCHs, the corresponding b-data and a-data are set to zero.

Note that this also applies to the case where the number of DPDCHs is zero due to inactive users, and also to the case where the number of DPCHs is zero due to compressed mode. It is anticipated that the condition of multiple DPDCHs will not often arise due to the extreme use of spectrum. If for a given frame the SF is greater than the minimum, the b-data is expanded to correspond to the lower SF. That is, for example, if the minimum SF is 4, but over some frames the SF is 8, then each SF-8 b-data bit is replicated twice so as to look like SF-4 data. Before the maximal ratio combination (MRC) operation, the y-data corresponding to expanded b-data is averaged to yield the proper SF-8 y-data.
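The expansion and averaging steps can be sketched as follows. This is a minimal Python illustration with hypothetical function names, assuming the frame SF is an integer multiple of the minimum SF.

```python
def expand_b(b_bits, sf, min_sf):
    """Replicate each bit sf/min_sf times so the data looks like min_sf data."""
    rep = sf // min_sf
    return [b for b in b_bits for _ in range(rep)]


def average_y(y_vals, sf, min_sf):
    """Average each group of sf/min_sf y-values back to one value per original bit."""
    rep = sf // min_sf
    return [sum(y_vals[i:i + rep]) / rep for i in range(0, len(y_vals), rep)]


# Two SF-8 bits expanded to SF-4 resolution.
b_sf4 = expand_b([1, -1], sf=8, min_sf=4)
# Four SF-4 soft values averaged back to two SF-8 soft values.
y_sf8 = average_y([0.5, 1.5, -1.0, -3.0], sf=8, min_sf=4)
```

Here `b_sf4` is `[1, 1, -1, -1]` and `y_sf8` is `[1.0, -2.0]`.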

FIG. 10 shows how rake output data is mapped to (virtual) user data structures. Each small box (e.g., 1002) in the figure represents a slot's-worth of data. For DPCCH y-data or b-data, for example, each box would represent 150 values. Data is mapped so as to be linear in memory and contiguous frame to frame for each antenna and each finger. The reason for this mapping is that data can easily be accessed by adjusting a pointer. A similar mapping is used for other data except the amplitude data, where it would be imprudent to attempt to keep the number of fingers constant over a time period of up to 8 frames. For the virtual-user code data there are generally 38,400 data items per frame; and for the b-data and y-data there are generally 150×256/SF data items per frame.

Note that for pre-MRC y-data, the mapping is linear and contiguous in memory for each antenna and each finger. Each DPCH is mapped to a separate virtual user data structure. The initial conditions data (frame 0 1004) is initially filled with zero data (except for the codes). After frame N data is written, this data is copied back to frame 0 1004, and the next frame of data that is written is written to frame 1 1006. For all data types the 0-index points to the first data item written to frame 0 1004. For example, the initial-condition b-data (frame 0) for an SF 256 virtual user is indexed b[0], b[1], . . . , b[149], and the b-data corresponding to frame 1 is b[150], b[151], . . . , b[299].

Four indices are of interest: chip index, bit index, symbol index, and slot index. The chip index r is always positive. All indices are related to the chip index. That is, for chip index r we have

-   Chip index=r
-   Bit index=r/Nk
-   Symbol index=r/256
-   Slot index=r/2560

where Nk is the spreading factor for virtual user k.
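The index relations can be written as a small helper. The Python sketch below uses a hypothetical function name and assumes that the divisions are integer (floor) divisions, which the text does not state explicitly.

```python
def indices_from_chip(r, n_k):
    """Map chip index r to bit, symbol and slot indices for spreading factor n_k."""
    return {
        "chip": r,
        "bit": r // n_k,       # one bit per Nk chips
        "symbol": r // 256,    # one symbol per 256 chips
        "slot": r // 2560,     # one slot per 2560 chips
    }


# Last chip of a 38,400-chip frame for an SF 64 virtual user.
last_chip = indices_from_chip(38399, n_k=64)
```

For this example the bit, symbol and slot indices are 599, 149 and 14, matching 600 bits, 150 symbols and 15 slots per frame.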

The elements of the (virtual) user data structures are given in the following table along with the memory requirements.

Element Type     Name                                     Bytes (expression)   Bytes
int              Dpch_type                                4                    4
int              Sf                                       4                    4
int              log2Sf                                   4                    4
float            Beta                                     4                    4
int              Mrc_bit_idx                              4                    4
int              N_bits_per_dpch                          4                    4
int              N_rake_fingers[Nf]                       4*8                  32
int              Chip_idx_rs[Lmax]                        4*8                  32
int              Chip_idx_ds[Lmax]                        4*8                  32
int              Delay_lag[Lmax]                          4*8                  32
int              finger_idx_max_lag                       4                    4
int              Chip_delay[Lmax]                         4*8                  32
int              Sub_chip_delay[Lmax]                     4*8                  32
COMPLEX          axcode[Nf][Na][Lmax][Nslots * 2][4]      8*8*2*8*15*2*4       122880
COMPLEX          a_hat_ds[Nf][Na][Lmax][Nslots * 2]       8*8*2*8*15*2         30720
COMPLEX*         mf_ylq[Na][Lmax]                         4*2*8                64
COMPLEX*         mud_ylq[Na][Lmax]                        4*2*8                64
float*           mf_y_data                                4                    4
float*           mud_y_data                               4                    4
char*            mf_b_data                                4                    4
char*            mud_b_data                               4                    4
char*            mod_b_data                               4                    4
char             Code[Nchips * (1+Nf)]                    1*38400*9            345600
COMPLEX          mud_ylq_save[Na][Lmax]                   8*2*8                128
int              Mrc_bit_idx_save                         4                    4
float            Repetition_rate                          4                    4
COMPLEX (1,2)    mf_ylq[Na][Lmax][Nbits1 * (1+Nf)]        8*2*8*1200*9         1382400
COMPLEX (1,2)    mud_ylq[Na][Lmax][Nbits1 * (1+Nf)]       8*2*8*1200*9         1382400
float (1,2)      mf_y_data[Nbits1 * (1+Nf)]               4*1200*9             43200
float (1,2)      mud_y_data[Nbits1 * (1+Nf)]              4*1200*9             43200
char (1,2)       mf_b_data[Nbits1 * (1+Nf)]               1*1200*9             10800
char (1,2)       mud_b_data[Nbits1 * (1+Nf)]              1*1200*9             10800
char (1,2)       mod_b_data[Nbits1 * (1+Nf)]              1*1200*9             10800
Total                                                                          3,383,304
x 256 v-users                                                                  866 Mbytes
OLD: COMPLEX     Code[Nchips * 2]                         8*38400*2            614400

where the following notations are defined:

(1) Associated data, not explicitly part of structure
(2) Based on 8 bits per symbol on average

Lmax = N_RAKE_FINGERS_MAX = 8
Na = N_ANTENNAS = 2
Nslots = N_SLOTS_PER_FRAME = 15
(Nbitsmax1 = N_BITS_PER_FRAME_MAX_1 = 9600)
Nchips = N_CHIPS_PER_FRAME = 38400
Nf = N_FRAMES_RAKE_OUTPUT = 8
Nbits1 = MEAN_BITS_PER_FRAME_1 = 150*4.25 ~= 640.

Each user class has a specified decoding to be performed. The decoding can be:

-   None
-   Soft Repetition Decoding (SRD)
-   Turbo decoding
-   Convolutional decoding.

All decoding is Soft-Input Soft-Output (SISO) decoding. For example, an SF 64 voice user produces 600 soft bits per frame. Thus 1,200 soft bits per 20 ms transmission time interval (TTI) are produced. These 1,200 soft bits are input to a SISO de-multiplex and convolutional decoding function that outputs 1,200 soft bits. The SISO de-multiplex and convolutional decoding function reduces the channel bit error rate (BER) and hence improves MUD performance. Since data is linear in memory, no reformatting of data is necessary and the operation can be performed in-place. If further decoders are included, partial-decode variants can be employed to reduce complexity. For turbo decoding, for example, the number of iterations may be limited to a small number.

The Long-code MUD performs the following operations:

-   Respread
-   Raised-Cosine Filtering
-   Despread
-   Maximal-Ratio Combining (MRC).

The re-spread function calculates ρ[t] given by

$\begin{matrix}{{\rho\lbrack t\rbrack} \equiv {\sum\limits_{k = 0}^{K_{v} - 1}{\sum\limits_{p = 0}^{L - 1}{\sum\limits_{r}{{\delta\left\lbrack {t - {\hat{\tau}}_{kp} - {rN}_{c}} \right\rbrack} \cdot {{\hat{a}}_{kp}\left\lbrack {r/2560} \right\rbrack} \cdot {c_{k}\lbrack r\rbrack} \cdot {{\hat{b}}_{k}\left\lbrack {r/N_{k}} \right\rbrack}}}}}} & (20)\end{matrix}$

The function ρ[t] is calculated over the interval t=0: Nf*M*Nc−1, where M=38400 is the number of chips per frame and Nf is the number of frames processed at a time. The actual function calculated is

$\begin{matrix}{{{\rho_{m}\lbrack t\rbrack} \equiv {\rho\left\lbrack {t + {{mN}_{c}N_{chips}}} \right\rbrack}},\quad {t = {0:{{N_{c}N_{chips}} - 1}}}} & (21)\end{matrix}$

which represents a section of the waveform of length Nchips chips, and the calculation is performed for m=0: Nf*M*Nc/Nchips−1. The function is defined (and allocated) for negative indices −(Lg−1): −1, representing the initial conditions, which are set to zero at start-up. The parameter Lg is the length of the raised-cosine filter discussed below.

Note that every finger of every user adds one and only one non-zero contribution per chip within this interval, corresponding to chip indices r. Given the delay lag {circumflex over (τ)}_(lq) for the qth finger of the lth user, we can determine which chip indices r contribute to a given interval. To this end define

$\begin{matrix}\begin{aligned}t &= {{nN}_{c} + q}, & 0 &\leq q < N_{c} \\ {\hat{\tau}}_{kp} &\equiv {{n_{kp}N_{c}} + q_{kp}}, & 0 &\leq q_{kp} < N_{c}\end{aligned} & (22)\end{matrix}$

The first definition defines t as belonging to the nth chip interval; the second is a decomposition of the delay lag into chip delay and sub-chip delay. Given the above we can solve for r and q using

$\begin{matrix}\begin{aligned}r &= {n - n_{kp}} \\ q &= q_{kp}\end{aligned} & (23)\end{matrix}$

Notice that chip indices r as given above can be negative. In the implementation the pointers {circumflex over (α)}_(kp), c_(k) and {circumflex over (b)}_(k) point to the first element of frame 1 1006 (FIG. 10).
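The decomposition in equations (22) and (23) amounts to two integer divisions. The following Python sketch uses a hypothetical function name and assumes that delays and sample times are integers measured in samples; it returns the chip index r that a finger contributes at sample t, or `None` when the sub-chip phases differ.

```python
def contributing_chip(t, tau_hat, samples_per_chip):
    """Solve t = tau_hat + r*Nc for the chip index r, per equations (22)-(23)."""
    n, q = divmod(t, samples_per_chip)               # t = n*Nc + q
    n_kp, q_kp = divmod(tau_hat, samples_per_chip)   # tau_hat = n_kp*Nc + q_kp
    if q != q_kp:
        return None          # this finger adds nothing at sample t
    return n - n_kp          # may be negative (initial-condition data)


r_inside = contributing_chip(t=19, tau_hat=11, samples_per_chip=8)
r_none = contributing_chip(t=20, tau_hat=11, samples_per_chip=8)
r_negative = contributing_chip(t=3, tau_hat=11, samples_per_chip=8)
```

In this example sample 19 receives chip r = 1, sample 20 receives nothing from this finger, and sample 3 maps to r = −1, which is why the data pointers are offset to the start of frame 1.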

The repeated amplitude-code multiplies are avoided by using:

$\begin{matrix}\begin{aligned}{\left( {\hat{a} \cdot c} \right)_{kp}\lbrack s\rbrack\left\lbrack {c_{k}\lbrack r\rbrack} \right\rbrack} &\equiv {{{\hat{a}}_{kp}\lbrack s\rbrack} \cdot {c_{k}\lbrack r\rbrack}} \\ {\left( {\hat{a} \cdot c} \right)_{kp}\lbrack s\rbrack\lbrack c\rbrack} &\equiv \begin{cases}{{{\hat{a}}_{kp}\lbrack s\rbrack} \cdot \left( {{+ 1} + j} \right)}, & {c = 0} \\ {{{\hat{a}}_{kp}\lbrack s\rbrack} \cdot \left( {{- 1} + j} \right)}, & {c = 1} \\ {{{\hat{a}}_{kp}\lbrack s\rbrack} \cdot \left( {{- 1} - j} \right)}, & {c = 2} \\ {{{\hat{a}}_{kp}\lbrack s\rbrack} \cdot \left( {{+ 1} - j} \right)}, & {c = 3}\end{cases} \\ {c_{k}\lbrack r\rbrack} &\equiv \begin{cases}0, & {{c_{k}\lbrack r\rbrack} = {{+ 1} + j}} \\ 1, & {{c_{k}\lbrack r\rbrack} = {{- 1} + j}} \\ 2, & {{c_{k}\lbrack r\rbrack} = {{- 1} - j}} \\ 3, & {{c_{k}\lbrack r\rbrack} = {{+ 1} - j}}\end{cases}\end{aligned} & (24)\end{matrix}$
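Equation (24) replaces a complex multiply per chip with a four-entry table lookup. A minimal Python sketch follows; the names `CODE_VALUES`, `code_index` and `build_axcode` are hypothetical, and the amplitude value is toy data.

```python
# The four possible complex code values, in the index order of equation (24).
CODE_VALUES = (1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j)


def code_index(c):
    """Map a complex code value to its table index 0..3."""
    return CODE_VALUES.index(c)


def build_axcode(a_hat):
    """Pre-multiply one amplitude by all four code values (done once per slot)."""
    return tuple(a_hat * v for v in CODE_VALUES)


a_hat = 0.5 - 0.25j
axcode = build_axcode(a_hat)

# Per-chip work is now a lookup instead of a complex multiply.
chip_code = -1 - 1j
product = axcode[code_index(chip_code)]
```

The lookup result equals `a_hat * chip_code`, which is (−0.75 − 0.25j) for these values.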

The raised-cosine filtering operation applied to the re-spread signal ρ[t] produces an estimate of the received signal given by:

$\begin{matrix}{{\hat{r}\lbrack t\rbrack} = {\sum\limits_{t^{\prime} = 0}^{L_{g} - 1}{{g\left\lbrack t^{\prime} \right\rbrack} \cdot {\rho\left\lbrack {t - t^{\prime}} \right\rbrack}}}} & (25)\end{matrix}$

where

-   g[t] is the raised-cosine pulse and
-   t=0: Nc*Nchips−1
-   t'=0: Lg−1
-   Lg=Nsamples−rc (length of raised-cosine filter)

For example, if an impulse at t=0 is passed through the above filter, the output is g[t]. The position of the maximum of the filter then specifies the delay through the filter. The delay is relevant since it specifies the synchronization information necessary for subsequent despreading. The raised-cosine filter is calculated over the time period n=(n1:n2)/Nc, where Nc is the number of samples per chip, and time is in chips. Note that n1 is negative, and the position of the maximum of the filter is at n=0. The length of the filter is then Lg=n2−n1, and the maximum occurs at sample n1. The delay is thus n1 samples, and the chip delay is n1/Nc chips. For simplicity of implementation, n1 is required to be a multiple of Nc.
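The filtering of equation (25) and the impulse test described above can be sketched directly. In the following Python illustration the function name and the five-tap pulse are assumptions chosen for brevity, not the filter used in the described system; samples before t=0 are treated as the zero initial conditions.

```python
def fir_filter(g, rho):
    """Equation (25): r_hat[t] = sum over t' of g[t'] * rho[t - t']."""
    out = []
    for t in range(len(rho)):
        acc = 0j
        for tp, g_tp in enumerate(g):
            if t - tp >= 0:                  # earlier samples are zero
                acc += g_tp * rho[t - tp]
        out.append(acc)
    return out


g = [0.1, 0.5, 1.0, 0.5, 0.1]      # toy symmetric pulse
impulse = [1] + [0] * 7
out = fir_filter(g, impulse)

# The filter delay is the position of the output maximum.
delay = max(range(len(out)), key=lambda t: abs(out[t]))
```

The impulse response reproduces `g`, and `delay` is 2 samples, the position of the pulse maximum.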

The de-spread operation calculates the pre-MRC detection statisticscorresponding to the estimate of the received signal:

$\begin{matrix}{{y_{{est},{lq}}^{(1)}\lbrack m\rbrack} \equiv {\frac{1}{2N_{l}}{\sum\limits_{n = 0}^{N_{l} - 1}{{\hat{r}\left\lbrack {{nN}_{c} + {\hat{\tau}}_{lq} + {mT}_{l}} \right\rbrack} \cdot {c_{lm}^{*}\lbrack n\rbrack}}}}} & (26)\end{matrix}$

Prior to the MRC operation, the MUD pre-MRC detection statistics are calculated according to:

$\begin{matrix}{{y_{lq}^{(1)}\lbrack m\rbrack} = {{{\hat{a}}_{lq} \cdot {{\hat{b}}_{l}\lbrack m\rbrack}} + {y_{lq}^{(0)}\lbrack m\rbrack} - {y_{{est},{lq}}^{(1)}\lbrack m\rbrack}}} & (27)\end{matrix}$

These are then combined with antenna amplitudes to form the post-MRC detection statistics:

$\begin{matrix}{{y_{l}^{(1)}\lbrack m\rbrack} \equiv {{Re}\left\{ {\sum\limits_{q = 1}^{L}{{\hat{a}}_{lq}^{{(1)}^{H}} \cdot {y_{lq}^{(1)}\lbrack m\rbrack}}} \right\}}} & (28)\end{matrix}$

Multiuser detection systems in accord with the foregoing embodiments can be implemented in any variety of general or special purpose hardware and/or software devices. FIG. 11 depicts one such implementation. In this embodiment, each frame of data is processed three times by the MUD processing card 118 (or, “MUD processor” for short), although it can be recognized that multiple such cards could be employed instead (or in addition) for this purpose. During the first pass, only the control channels are respread, while the maximum ratio combination (MRC) and MUD processing are performed on the data channels. During subsequent passes, data channels are processed exclusively, with new y (i.e., soft decisions) and b (i.e., hard decisions) data being generated as shown in the diagram.

Amplitude ratios and amplitudes are determined via the DSP (e.g., element 900, or a DSP otherwise coupled with the processor board 118 and receiver 110), as well as certain waveform statistics. These values (e.g., matrices and vectors) are used by the MUD processor in various ways. The MUD processor is decomposed into four stages that closely match the structure of the software simulation: Alpha Calculation and Respread 1302, raised-cosine filtering 1304, de-spreading 1306, and MRC 1308. Each pass through the MUD processor is equivalent to one processing stage of the implementations discussed above. The design is pipelined and “parallelized.” In the illustrated embodiment, the clock speed can be 132 MHz, resulting in a throughput of 2.33 ms/frame; however, the clock rate and throughput vary depending on the requirements. The illustrated embodiment allows for three-pass MUD processing with additional overhead from external processing, resulting in a 4-times real-time processing throughput.

The alpha calculation and respread operations 1302 are carried out by a set of thirty-two processing elements arranged in parallel. These can be processing elements within an ASIC, FPGA, PLD or other such device, for example. Each processing element processes two users of four fingers each. Values for b are stored in a double-buffered lookup table. Values of {circumflex over (α)} and j{circumflex over (α)} are pre-multiplied with beta by an external processor and stored in a quad-buffered lookup table. The alpha calculation stage generates the following values for each finger, where subscripts indicate the antenna identifier:

α₀=β₀·(C·{circumflex over (α)}₀ − jC·j{circumflex over (α)}₀)
jα₀=β₀·(jC·{circumflex over (α)}₀ + C·j{circumflex over (α)}₀)
α₁=β₁·(C·{circumflex over (α)}₁ − jC·j{circumflex over (α)}₁)
jα₁=β₁·(jC·{circumflex over (α)}₁ + C·j{circumflex over (α)}₁)

These values are accumulated during the serial processing cycle into four independent 8-times oversampling buffers. There are eight memory elements in each buffer, and the element used is determined by the sub-chip delay setting for each finger.

Once eight fingers have been accumulated into the oversampling buffer, the data is passed into a set of four independent adder-trees. These adder-trees each terminate in a single output, completing the respread operation. The four raised-cosine filters 1304 convolve the alpha data with a set of weights determined by the following equation:

${g_{rc}(t)} = \frac{{\sin\left( {\pi\frac{t}{T}} \right)} \cdot {\cos\left( {{\alpha\pi}\frac{t}{T}} \right)}}{\pi\frac{t}{T}\left( {1 - \left( {2\alpha\frac{t}{T}} \right)^{2}} \right)}$

The filters can be implemented with 97 taps with odd symmetry. The filters illustrated run at 8-times the chip rate; however, other rates are possible. The filters can be implemented in a variety of compute elements 220, or in other devices such as ASICs or FPGAs, for example.
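A set of 97 raised-cosine weights at 8 samples per chip can be generated as in the following Python sketch. It evaluates the standard raised-cosine pulse sin(πt/T)·cos(απt/T) / (πt/T·(1 − (2αt/T)²)). The roll-off value 0.22, the centering of the taps on t=0, and all names are assumptions made for illustration; the text above does not specify them.

```python
import math


def raised_cosine(t, chip_period=1.0, alpha=0.22):
    """Raised-cosine pulse value at time t (alpha is an assumed roll-off)."""
    x = t / chip_period
    if x == 0.0:
        return 1.0                               # limit of sin(pi x)/(pi x)
    denom = 1.0 - (2.0 * alpha * x) ** 2
    if abs(denom) < 1e-12:                       # removable singularity
        u = 1.0 / (2.0 * alpha)
        return (math.pi / 4.0) * math.sin(math.pi * u) / (math.pi * u)
    return (math.sin(math.pi * x) * math.cos(alpha * math.pi * x)) / (math.pi * x * denom)


N_TAPS = 97
OVERSAMPLING = 8                  # samples per chip
CENTER = (N_TAPS - 1) // 2        # tap index of the pulse maximum

taps = [raised_cosine((k - CENTER) / OVERSAMPLING) for k in range(N_TAPS)]
```

With these assumptions the taps span ±6 chips, the center tap is 1.0, and the pulse crosses zero at every non-zero whole-chip offset.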

The despread function 1306 can be performed by a set of thirty-two processing elements arranged in parallel. Each processing element serially processes two users of four fingers each. For each finger, one chip value out of eight, selected based on the sub-chip delay, is accepted from the output of the raised-cosine filter. The despread stage performs the following calculations for each finger (subscripts indicate antenna):

$\begin{matrix}{y_{0} = {\sum\limits_{0}^{{SF} - 1}\left( {{C \cdot r_{0}} + {{jC} \cdot {jr}_{0}}} \right)}} \\{{jy}_{0} = {\sum\limits_{0}^{{SF} - 1}\left( {{C \cdot {jr}_{0}} - {{jC} \cdot r_{0}}} \right)}} \\{y_{1} = {\sum\limits_{0}^{{SF} - 1}\left( {{C \cdot r_{1}} + {{jC} \cdot {jr}_{1}}} \right)}} \\{{jy}_{1} = {\sum\limits_{0}^{{SF} - 1}\left( {{C \cdot {jr}_{1}} - {{jC} \cdot r_{1}}} \right)}}\end{matrix}$
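The despread sums above are the real and imaginary parts of a complex correlation with the conjugate code, written with real arithmetic only. The Python sketch below illustrates this for one antenna; the function names and the toy chip values are assumptions, and C, jC, r and jr are taken to be the real and imaginary parts of the code and of the filtered samples.

```python
def despread_real(code_re, code_im, r_re, r_im):
    """Real-arithmetic despread for one finger on one antenna."""
    y = sum(c * r + jc * jr for c, jc, r, jr in zip(code_re, code_im, r_re, r_im))
    jy = sum(c * jr - jc * r for c, jc, r, jr in zip(code_re, code_im, r_re, r_im))
    return y, jy


def despread_complex(code, samples):
    """Equivalent complex form: sum of conj(code) * sample."""
    return sum(c.conjugate() * s for c, s in zip(code, samples))


# Toy data for a spreading factor of 4.
code = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
samples = [2 + 3j, -1 + 4j, 0 - 5j, 7 + 1j]

y, jy = despread_real([c.real for c in code], [c.imag for c in code],
                      [s.real for s in samples], [s.imag for s in samples])
y_complex = despread_complex(code, samples)
```

Both forms give 21 + 11j for this data, so a hardware element can use four real multiply-accumulates per chip in place of a complex multiplier.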

The MRC operations are carried out by a set of four processing elements arranged in parallel, such as the compute elements 220, for example. Each processor is capable of serially processing eight users of four fingers each. Values for y are stored in a double-buffered lookup table. Values for b are derived from the MSB of the y data. Note that the b data used in the MUD stage is independent of the b data used in the respread stage. Values of {circumflex over (α)} and j{circumflex over (α)} are pre-multiplied with β by an external processor and stored in a quad-buffered lookup table. Also, Σ({circumflex over (α)}²+j{circumflex over (α)}²) for each channel is stored in a quad-buffered table.

The output stage contains a set of sequential destination buffer pointers for each channel. The data generated by each channel, on a slot basis, is transferred to the crossbar (or other interconnect) destination indicated by these pointers. The first word of each of these transfers contains a counter in the lower sixteen bits indicating how many y values were generated. The upper sixteen bits contain the constant value 0xAA55. This allows the DSP to avoid interrupts by scanning the first word of each buffer. In addition, the DSP_UPDATE register contains a pointer to a single crossbar location. Each time a slot of channel data is transmitted, an internal counter is written to this location. The counter is limited to 10 bits and will wrap around with a terminal count value of 1023.
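The scan of the first word described above can be sketched as follows. The function name `parse_first_word` and the convention of returning `None` for a buffer that has not yet been written are assumptions for this example.

```python
HEADER_MARKER = 0xAA55  # constant carried in the upper sixteen bits

def parse_first_word(word):
    """Return the y-value count from a 32-bit first word, or None if the marker is absent."""
    if ((word >> 16) & 0xFFFF) != HEADER_MARKER:
        return None
    return word & 0xFFFF
```

A DSP polling loop could, under these assumptions, clear the first word of a buffer after consuming it and treat a later non-`None` result as an indication that new slot data has arrived.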

The method of operation for the long-code multiple user detection (LCMUD) algorithm is as follows. Spread-factor-four (SF4) channels require a significant amount of data transfer. In order to limit the gate count of the hardware implementation, processing an SF4 channel can result in reduced capability.

An SF4 user can be processed on certain hardware channels. When one of these special channels is operating on an SF4 user, the next three channels are disabled and are therefore unavailable for processing. This relationship is shown in the following table:

SF4 Chan  Disabled Channels    SF4 Chan  Disabled Channels
0         1, 2, 3              32        33, 34, 35
4         5, 6, 7              36        37, 38, 39
8         9, 10, 11            40        41, 42, 43
12        13, 14, 15           44        45, 46, 47
16        17, 18, 19           48        49, 50, 51
20        21, 22, 23           52        53, 54, 55
24        25, 26, 27           56        57, 58, 59
28        29, 30, 31           60        61, 62, 63
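The relationship in the table can be expressed compactly. The following sketch assumes sixty-four hardware channels, with every fourth channel capable of SF4 operation as the table indicates; the function name is hypothetical.

```python
def sf4_disabled_channels(channel):
    """Return the channels disabled when `channel` processes an SF4 user."""
    if not 0 <= channel < 64 or channel % 4 != 0:
        raise ValueError("SF4 processing is only available on channels 0, 4, 8, ..., 60")
    return [channel + 1, channel + 2, channel + 3]
```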

The default y and b data buffers do not contain enough space for SF4 data. When a channel is operating on SF4 data, the y and b buffers extend into the space of the next channel in sequence. For example, if channel 0 is processing SF4 data, the channel 0 and channel 1 b buffers are merged into a single large buffer of 0x40 32-bit words. The y buffers are merged similarly.

In typical operation, the first pass of the LCMUD algorithm will respread the control channels in order to remove control interference. For this pass, the b data for the control channels should be loaded into BLUT while the y data for data channels should be loaded into YDEC. Each channel should be configured to operate at the spread factor of the data channel stored in the YDEC table.

Control channels are always operated at SF 256, so it is likely that the control data will need to be replicated to match the data channel spread factor. For example, each bit (b entry) of control data would be replicated 64 times if that control channel were associated with an SF 4 data channel.
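The replication described above can be sketched as follows, assuming the control bits are held in a simple list; the function name and argument names are hypothetical.

```python
def replicate_control_bits(control_bits, data_sf, control_sf=256):
    """Repeat each SF 256 control bit so that it lines up with data bits at `data_sf`."""
    if data_sf <= 0 or control_sf % data_sf != 0:
        raise ValueError("data spread factor must divide the control spread factor")
    factor = control_sf // data_sf  # 64 for an SF 4 data channel
    return [bit for bit in control_bits for _ in range(factor)]
```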

Each finger in a channel arrives at the receiver with a different delay. During the Respread operation, this skew among the fingers is recreated. During the MRC stage of MUD processing, it is necessary to remove this skew and realign the fingers of each channel. This is accomplished in the MUD processor by determining the first bit available from the most delayed finger and discarding all previous bits from all other fingers. The number of bits to discard can be individually programmed for each finger with the Discard field of the MUDPARAM registers. This operation will typically result in a ‘short’ first slot of data. This is unavoidable when the MUD processor is first initialized and should not create any significant problems. The entire first slot of data can be completely discarded if ‘short’ slots are undesirable.

A similar situation will arise each time processing is begun on a frame of data. To avoid losing data, it is recommended that a partial slot of data from the previous frame be overlapped with the new frame. Trimming any redundant bits created this way can be accomplished with the Discard register setting or in the system DSP. In order to limit memory requirements, the LCMUD FPGA processes one slot of data at a time. Double buffering is used for b and y data so that processing can continue as data is streamed in. Filling these buffers is complicated by the skew that exists among fingers in a channel.

FIG. 12 illustrates the skew relationship among fingers in a channel and among the channels themselves. The illustrated embodiment allows for 20 µs (77.8 chips) of skew among fingers in a channel and certain skew among channels; however, in other embodiments these skew allowances vary.

There are three related problems that are introduced by skew: identifying frame and slot boundaries, populating the b and y tables, and changing channel constants. Because every finger of every channel can arrive at a different time, there are no universal frame and slot boundaries. The DSP must select an arbitrary reference point. The data stored in the b and y tables is likely to come from two adjacent slots.

Because skew exists among fingers in a channel, it is not enough to populate the b and y tables with 2,560 sequential chips of data. There must be some data overlap between buffers to allow lagging channels to access “old” data. The amount of overlap can be calculated dynamically or fixed at some number greater than 78 and divisible by four (e.g., 80 chips). The starting point for each register is determined by the Chip Advance field of the MUDPARAM register.
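A dynamically calculated overlap can be obtained by rounding the maximum expected skew up to the next multiple of four chips. The sketch below is one possible calculation; the function name is hypothetical.

```python
import math

def overlap_chips(max_skew_chips, multiple=4):
    """Smallest multiple of `multiple` that is at least `max_skew_chips`."""
    return multiple * math.ceil(max_skew_chips / multiple)
```

With a maximum skew of 78 chips this yields the 80-chip overlap mentioned above.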

A related problem is created by the significant skew among channels. As can be seen in FIG. 12, Channel 0 is receiving Slot 0 while Channel 1 is receiving Slot 2. The DSP must take this skew into account when generating the b and y tables and temporally align channel data.

Selecting an arbitrary “slot” of data from a channel implies that channel constants tied to the physical slot boundaries may change while processing the arbitrary slot. The Constant Advance field of the MUDPARAM register is used to indicate when these constants should change. Registers affected this way are quad-buffered. Before data processing begins, at least two of these buffers should be initialized. During normal operation, one additional buffer is initialized for each slot processed. This system guarantees that valid constants data will always be available.

The following two tables show the long-code MUD FPGA memory map and the control/status register:

Start Addr  End Addr   Name        Description
0000_0000   0000_0000  CSR         Control & Status Register
0000_0008   0000_000C  DSP_UPDATE  Route & Address for DSP updating
0001_0000   0001_FFFF  MUDPARAM    MUD Parameters
0002_0000   0002_FFFF  CODE        Spreading Codes
0003_0000   0004_FFFF  BLUT        Respread: b Lookup Table
0005_0000   0005_FFFF  BETA_A      Respread: Beta * a_hat Lookup Table
0006_0000   0007_FFFF  YDEC        MUD & MRC: y Lookup Table
0008_0000   0008_FFFF  ASQ         MUD & MRC: Sum a_hat squared LUT
000A_0000   000A_FFFF  OUTPUT      Output Routes & Addresses

Bits   Name      R/W  Reset
31-9   Reserved  RO   X
8      YB        RO   0
7-6    CBUF      RO   0
5      A1        RO   0
4      A0        RO   0
3      R1        RW   0
2      R0        RW   0
1      Lst       RW   0
0      Rst       RW   0

The YB field indicates which of the two y and b buffers is in use. If the system is currently not processing, YB indicates the buffer that will be used when processing is initiated.

CBUF indicates which of four round-robin buffers for MUD constants (β·â) is currently in use. Finger skew will result in some fingers using a buffer one in advance of this indicator. To guarantee that valid data is always available, two full buffers should be initialized before operation begins. If the system is currently not processing, CBUF indicates the buffer that will be used when processing is restarted. It is technically possible to indicate precisely which buffer is in use for each finger in both the Respread and Despread processing stages. However, this would require thirty-two 32-bit registers. Implementing these registers would be costly, and the information is of little value.

A1 and A0 indicate which y and b buffers are currently being processed. A1 and A0 will never indicate ‘1’ at the same time. An indication of ‘0’ for both A1 and A0 means that the MUD processor is idle. R1 and R0 are writable fields that indicate to the MUD processor that data is available. R1 corresponds to y and b buffer 1 and R0 corresponds to y and b buffer 0. Writing a ‘1’ into the correct field will initiate MUD processing. Note that these buffers follow strict round-robin ordering. The YB field indicates which buffer should be activated next.

The R1 and R0 fields are automatically reset to ‘0’ by the MUD hardware once processing is completed. It is not possible for the external processor to force a ‘0’ into these fields. A ‘1’ in the Lst bit indicates that this is the last slot of data in a frame. Once all available data for the slot has been processed, the output buffers will be flushed. A ‘1’ in the Rst bit places the MUD processor into a reset state. The external processor must manually bring the MUD processor out of reset by writing a ‘0’ into this bit.
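The field descriptions above can be illustrated by a small decoder for the control/status register. The bit positions used here are an assumption: they are inferred from the order of the fields in the register table and from its nine low-order bits that reset to ‘0’, with CBUF taken to be two bits wide because it selects one of four buffers. The names `CSR_FIELDS`, `decode_csr` and `is_idle` are hypothetical.

```python
# name: (least significant bit, width); assumed layout, see the text above
CSR_FIELDS = {
    "YB": (8, 1),
    "CBUF": (6, 2),
    "A1": (5, 1),
    "A0": (4, 1),
    "R1": (3, 1),
    "R0": (2, 1),
    "Lst": (1, 1),
    "Rst": (0, 1),
}

def decode_csr(value):
    """Split a 32-bit CSR value into its named fields."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in CSR_FIELDS.items()
    }

def is_idle(value):
    """The MUD processor is idle when neither A1 nor A0 is set."""
    fields = decode_csr(value)
    return fields["A1"] == 0 and fields["A0"] == 0
```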

DSP_UPDATE is arranged as two 32-bit registers. A RACEway™ route to the MUD DSP is stored at address 0x0000_0008. A pointer to a status memory buffer is located at address 0x0000_000C. Each time the MUD processor writes a slot of channel data to a completion buffer, an incrementing count value is written to this address. The counter is fixed at 10 bits and will wrap around after a terminal count of 1023.
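Because the counter wraps after 1023, a DSP that polls the status location can compute the number of completions since its previous read with modular arithmetic. The sketch below assumes fewer than 1,024 completions occur between reads; the function name is hypothetical.

```python
COUNTER_MODULUS = 1 << 10  # 10-bit counter: values 0 through 1023

def completions_since(previous, current):
    """Number of slot transfers completed between two reads of the counter."""
    return (current - previous) % COUNTER_MODULUS
```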

A quad-buffered version of the MUD parameter control register exists for each finger to be processed. Execution begins with buffer 0 and continues in round-robin fashion. These buffers are used in synchronization with the MUD constants (Beta * a_hat, etc.) buffers. Each finger is provided with an independent register to allow independent switching of constant values at slot and frame boundaries. The following table shows offsets for each MUD channel:

Offset  User    Offset  User    Offset  User    Offset  User
0x0000  0       0x0040  1       0x0080  2       0x00C0  3
0x0100  4       0x0140  5       0x0180  6       0x01C0  7
0x0200  8       0x0240  9       0x0280  10      0x02C0  11
0x0300  12      0x0340  13      0x0380  14      0x03C0  15
0x0400  16      0x0440  17      0x0480  18      0x04C0  19
0x0500  20      0x0540  21      0x0580  22      0x05C0  23
0x0600  24      0x0640  25      0x0680  26      0x06C0  27
0x0700  28      0x0740  29      0x0780  30      0x07C0  31
0x0800  32      0x0840  33      0x0880  34      0x08C0  35
0x0900  36      0x0940  37      0x0980  38      0x09C0  39
0x0A00  40      0x0A40  41      0x0A80  42      0x0AC0  43
0x0B00  44      0x0B40  45      0x0B80  46      0x0BC0  47
0x0C00  48      0x0C40  49      0x0C80  50      0x0CC0  51
0x0D00  52      0x0D40  53      0x0D80  54      0x0DC0  55
0x0E00  56      0x0E40  57      0x0E80  58      0x0EC0  59
0x0F00  60      0x0F40  61      0x0F80  62      0x0FC0  63

The following table shows buffer offsets within each channel:

Offset  Finger  Buffer
0x0000  0       0
0x0004  0       1
0x0008  0       2
0x000C  0       3
0x0010  1       0
0x0014  1       1
0x0018  1       2
0x001C  1       3
0x0020  2       0
0x0024  2       1
0x0028  2       2
0x002C  2       3
0x0030  3       0
0x0034  3       1
0x0038  3       2
0x003C  3       3
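The two offset tables can be combined to locate the parameter register for a given user, finger and buffer. The sketch below assumes that the offsets are added to the MUDPARAM start address given in the memory map (0001_0000); the function name is hypothetical.

```python
MUDPARAM_BASE = 0x00010000  # start address of MUDPARAM in the memory map

def mudparam_address(user, finger, buffer):
    """Address of the MUD parameter register for one user, finger and buffer."""
    if not (0 <= user < 64 and 0 <= finger < 4 and 0 <= buffer < 4):
        raise ValueError("expected user 0-63, finger 0-3, buffer 0-3")
    return MUDPARAM_BASE + user * 0x40 + finger * 0x10 + buffer * 0x04
```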

The following table shows details of the control register:

Bits    Name              R/W  Reset
31-16   Spread Factor     RW   X
        Subchip Delay     RW   X
        Discard           RW   X
15-0    Chip Advance      RW   X
        Constant Advance  RW   X

The Spread Factor field determines how many chip samples are used to generate a data bit. In the illustrated embodiment, all fingers in a channel have the same spread factor setting; however, one skilled in the art will appreciate that the setting can vary among fingers in other embodiments. The spread factor is encoded into a 3-bit value as shown in the following table:

SF Factor  Spread Factor
000        256
001        128
010        64
011        32
100        16
101        8
110        4
111        RESERVED
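The encoding in the table halves the spread factor for each increment of the 3-bit code, so it can be computed with a shift. The following sketch is illustrative; the function names are hypothetical.

```python
def decode_spread_factor(code):
    """Map the 3-bit field value to a spread factor; 0b111 is reserved."""
    if not 0 <= code <= 6:
        raise ValueError("reserved or out-of-range spread factor code")
    return 256 >> code

def encode_spread_factor(spread_factor):
    """Map a spread factor of 4, 8, ..., 256 to its 3-bit field value."""
    for code in range(7):
        if 256 >> code == spread_factor:
            return code
    raise ValueError("unsupported spread factor")
```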

The Subchip Delay field specifies the sub-chip delay for the finger. It is used to select one of eight accumulation buffers prior to summing all alpha values and passing them into the raised-cosine filter. The Discard field determines how many MUD-processed soft decisions (y values) to discard at the start of processing. This is done so that the first y value from each finger corresponds to the same bit. After the first slot of data is processed, the Discard field should be set to zero.

The behavior of the Discard field is different from that of other register fields. Once a non-zero discard setting is detected, any new discard settings from switching to a new table entry are ignored until the current discard count reaches zero. After the count reaches zero, a new discard setting may be loaded the next time a new table entry is accessed.
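This latching behavior can be modeled in software as a small state machine. The class and method names below are hypothetical and the model is a sketch of the described behavior only, not of the hardware implementation.

```python
class DiscardCounter:
    """Software model of the Discard field's latching behavior."""

    def __init__(self):
        self.remaining = 0

    def load(self, setting):
        """Called on each switch to a new table entry; ignored while a count is pending."""
        if self.remaining == 0:
            self.remaining = setting

    def consume(self):
        """Return True if the current y value should be discarded."""
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False
```

For example, loading a setting of 2 and then a setting of 5 before any y values are produced results in exactly two discarded values, because the second setting is ignored.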

All fingers within a channel will arrive at the receiver with different delays. Chip Advance is used to recreate this signal skew during the Respread operation. The y and b buffers are arranged with older data occupying lower memory addresses. Therefore, the finger with the earliest arrival time has the highest value of Chip Advance. Chip Advance need not be a multiple of Spread Factor.

Constant Advance indicates on which chip this finger should switch to a new set of constants (e.g., â) and a new control register setting. Note that the new values take effect on the chip after the value stored here. For example, a value of 0x0 would cause the new constants to take effect on chip 1. A value of 0xFF would cause the new constants to take effect on chip 0 of the next slot.

The b lookup tables are arranged as shown in the following table. B values each occupy two bits of memory, although only the LSB is utilized by LCMUD hardware.

Offset  Buffer    Offset  Buffer    Offset  Buffer    Offset  Buffer
0x0000  U0 B0     0x0020  U1 B0     0x0040  U0 B1     0x0060  U1 B1
0x0080  U2 B0     0x00A0  U3 B0     0x00C0  U2 B1     0x00E0  U3 B1
0x0100  U4 B0     0x0120  U5 B0     0x0140  U4 B1     0x0160  U5 B1
0x0180  U6 B0     0x01A0  U7 B0     0x01C0  U6 B1     0x01E0  U7 B1
0x0200  U8 B0     0x0220  U9 B0     0x0240  U8 B1     0x0260  U9 B1
0x0280  U10 B0    0x02A0  U11 B0    0x02C0  U10 B1    0x02E0  U11 B1
0x0300  U12 B0    0x0320  U13 B0    0x0340  U12 B1    0x0360  U13 B1
0x0380  U14 B0    0x03A0  U15 B0    0x03C0  U14 B1    0x03E0  U15 B1
0x0400  U16 B0    0x0420  U17 B0    0x0440  U16 B1    0x0460  U17 B1
0x0480  U18 B0    0x04A0  U19 B0    0x04C0  U18 B1    0x04E0  U19 B1
0x0500  U20 B0    0x0520  U21 B0    0x0540  U20 B1    0x0560  U21 B1
0x0580  U22 B0    0x05A0  U23 B0    0x05C0  U22 B1    0x05E0  U23 B1
0x0600  U24 B0    0x0620  U25 B0    0x0640  U24 B1    0x0660  U25 B1
0x0680  U26 B0    0x06A0  U27 B0    0x06C0  U26 B1    0x06E0  U27 B1
0x0700  U28 B0    0x0720  U29 B0    0x0740  U28 B1    0x0760  U29 B1
0x0780  U30 B0    0x07A0  U31 B0    0x07C0  U30 B1    0x07E0  U31 B1
0x0800  U32 B0    0x0820  U33 B0    0x0840  U32 B1    0x0860  U33 B1
0x0880  U34 B0    0x08A0  U35 B0    0x08C0  U34 B1    0x08E0  U35 B1
0x0900  U36 B0    0x0920  U37 B0    0x0940  U36 B1    0x0960  U37 B1
0x0980  U38 B0    0x09A0  U39 B0    0x09C0  U38 B1    0x09E0  U39 B1
0x0A00  U40 B0    0x0A20  U41 B0    0x0A40  U40 B1    0x0A60  U41 B1
0x0A80  U42 B0    0x0AA0  U43 B0    0x0AC0  U42 B1    0x0AE0  U43 B1
0x0B00  U44 B0    0x0B20  U45 B0    0x0B40  U44 B1    0x0B60  U45 B1
0x0B80  U46 B0    0x0BA0  U47 B0    0x0BC0  U46 B1    0x0BE0  U47 B1
0x0C00  U48 B0    0x0C20  U49 B0    0x0C40  U48 B1    0x0C60  U49 B1
0x0C80  U50 B0    0x0CA0  U51 B0    0x0CC0  U50 B1    0x0CE0  U51 B1
0x0D00  U52 B0    0x0D20  U53 B0    0x0D40  U52 B1    0x0D60  U53 B1
0x0D80  U54 B0    0x0DA0  U55 B0    0x0DC0  U54 B1    0x0DE0  U55 B1
0x0E00  U56 B0    0x0E20  U57 B0    0x0E40  U56 B1    0x0E60  U57 B1
0x0E80  U58 B0    0x0EA0  U59 B0    0x0EC0  U58 B1    0x0EE0  U59 B1
0x0F00  U60 B0    0x0F20  U61 B0    0x0F40  U60 B1    0x0F60  U61 B1
0x0F80  U62 B0    0x0FA0  U63 B0    0x0FC0  U62 B1    0x0FE0  U63 B1

Spread Factor 4 channels require more storage space than is available in a single channel buffer. To allow for SF4 processing, the buffers for an even channel and the next highest odd channel are joined together. The even channel performs the processing while the odd channel is disabled. The following table illustrates how the two-bit values are packed into 32-bit words.

Bits   31-30  29-28  27-26  25-24  23-22  21-20  19-18  17-16
Name   b(0)   b(1)   b(2)   b(3)   b(4)   b(5)   b(6)   b(7)

Bits   15-14  13-12  11-10  9-8    7-6    5-4    3-2    1-0
Name   b(8)   b(9)   b(10)  b(11)  b(12)  b(13)  b(14)  b(15)
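The packing shown in the table places b(0) in the two most significant bits of the word and b(15) in the two least significant bits. A minimal sketch of packing and unpacking follows; the function names are hypothetical.

```python
def pack_b_values(values):
    """Pack sixteen 2-bit b values into one 32-bit word, b(0) in bits 31-30."""
    if len(values) != 16:
        raise ValueError("exactly sixteen b values are required per word")
    word = 0
    for index, value in enumerate(values):
        word |= (value & 0x3) << (30 - 2 * index)
    return word

def unpack_b_values(word):
    """Recover the sixteen 2-bit b values from a 32-bit word."""
    return [(word >> (30 - 2 * index)) & 0x3 for index in range(16)]
```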

The beta*a-hat table contains the amplitude estimates for each finger pre-multiplied by the value of Beta. The following table shows the memory mappings for each channel.

Offset  User    Offset  User    Offset  User    Offset  User
0x0000  0       0x0080  1       0x0100  2       0x0180  3
0x0200  4       0x0280  5       0x0300  6       0x0380  7
0x0400  8       0x0480  9       0x0500  10      0x0580  11
0x0600  12      0x0680  13      0x0700  14      0x0780  15
0x0800  16      0x0880  17      0x0900  18      0x0980  19
0x0A00  20      0x0A80  21      0x0B00  22      0x0B80  23
0x0C00  24      0x0C80  25      0x0D00  26      0x0D80  27
0x0E00  28      0x0E80  29      0x0F00  30      0x0F80  31
0x1000  32      0x1080  33      0x1100  34      0x1180  35
0x1200  36      0x1280  37      0x1300  38      0x1380  39
0x1400  40      0x1480  41      0x1500  42      0x1580  43
0x1600  44      0x1680  45      0x1700  46      0x1780  47
0x1800  48      0x1880  49      0x1900  50      0x1980  51
0x1A00  52      0x1A80  53      0x1B00  54      0x1B80  55
0x1C00  56      0x1C80  57      0x1D00  58      0x1D80  59
0x1E00  60      0x1E80  61      0x1F00  62      0x1F80  63

The following table shows the offsets of the buffers within each channel:

Offset  User Buffer
0x00    0
0x20    1
0x40    2
0x60    3

The following table shows a memory mapping for individual fingers of each antenna.

Offset  Finger  Antenna
0x00    0       0
0x04    1       0
0x08    2       0
0x0C    3       0
0x10    0       1
0x14    1       1
0x18    2       1
0x1C    3       1

The y (soft decisions) table contains two buffers for each channel. Like the b lookup table, an even and odd channel are bonded together to process SF4. Each y data value is stored as a byte. The data is written into the buffers as packed 32-bit words.

Offset  Buffer    Offset  Buffer    Offset  Buffer    Offset  Buffer
0x0000  U0 B0     0x0200  U1 B0     0x0400  U2 B1     0x0600  U3 B1
0x0800  U0 B0     0x0A00  U1 B0     0x0C00  U2 B1     0x0E00  U3 B1
0x0000  U4 B0     0x0200  U5 B0     0x0400  U6 B1     0x0600  U7 B1
0x0800  U4 B0     0x0A00  U5 B0     0x0C00  U6 B1     0x0E00  U7 B1
0x0000  U8 B0     0x0200  U9 B0     0x0400  U10 B1    0x0600  U11 B1
0x0800  U8 B0     0x0A00  U9 B0     0x0C00  U10 B1    0x0E00  U11 B1
0x0000  U12 B0    0x0200  U13 B0    0x0400  U14 B1    0x0600  U15 B1
0x0800  U12 B0    0x0A00  U13 B0    0x0C00  U14 B1    0x0E00  U15 B1
0x4000  U16 B0    0x4200  U17 B0    0x4400  U18 B1    0x4600  U19 B1
0x4800  U16 B0    0x4A00  U17 B0    0x4C00  U18 B1    0x4E00  U19 B1
0x5000  U20 B0    0x5200  U21 B0    0x5400  U22 B1    0x5600  U23 B1
0x5800  U20 B0    0x5A00  U21 B0    0x5C00  U22 B1    0x5E00  U23 B1
0x6000  U24 B0    0x6200  U25 B0    0x6400  U26 B1    0x6600  U27 B1
0x6800  U24 B0    0x6A00  U25 B0    0x6C00  U26 B1    0x6E00  U27 B1
0x7000  U28 B0    0x7200  U29 B0    0x7400  U30 B1    0x7600  U31 B1
0x7800  U28 B0    0x7A00  U29 B0    0x7C00  U30 B1    0x7E00  U31 B1
0x8000  U32 B0    0x8200  U33 B0    0x8400  U34 B1    0x8600  U35 B1
0x8800  U32 B0    0x8A00  U33 B0    0x8C00  U34 B1    0x8E00  U35 B1
0x9000  U36 B0    0x9200  U37 B0    0x9400  U38 B1    0x9600  U39 B1
0x9800  U36 B0    0x9A00  U37 B0    0x9C00  U38 B1    0x9E00  U39 B1
0xA000  U40 B0    0xA200  U41 B0    0xA400  U42 B1    0xA600  U43 B1
0xA800  U40 B0    0xAA00  U41 B0    0xAC00  U42 B1    0xAE00  U43 B1
0xB000  U44 B0    0xB200  U45 B0    0xB400  U46 B1    0xB600  U47 B1
0xB800  U44 B0    0xBA00  U45 B0    0xBC00  U46 B1    0xBE00  U47 B1
0xC000  U48 B0    0xC200  U49 B0    0xC400  U50 B1    0xC600  U51 B1
0xC800  U48 B0    0xCA00  U49 B0    0xCC00  U50 B1    0xCE00  U51 B1
0xD000  U52 B0    0xD200  U53 B0    0xD400  U54 B1    0xD600  U55 B1
0xD800  U52 B0    0xDA00  U53 B0    0xDC00  U54 B1    0xDE00  U55 B1
0xE000  U56 B0    0xE200  U57 B0    0xE400  U58 B1    0xE600  U59 B1
0xE800  U56 B0    0xEA00  U57 B0    0xEC00  U58 B1    0xEE00  U59 B1
0xF000  U60 B0    0xF200  U61 B0    0xF400  U62 B1    0xF600  U63 B1
0xF800  U60 B0    0xFA00  U61 B0    0xFC00  U62 B1    0xFE00  U63 B1

The sum of the a-hat squares is stored as a 16-bit value. The following table contains a memory address mapping for each channel.

Offset  User    Offset  User    Offset  User    Offset  User
0x0000  0       0x0020  1       0x0040  2       0x0060  3
0x0080  4       0x00A0  5       0x00C0  6       0x00E0  7
0x0100  8       0x0120  9       0x0140  10      0x0160  11
0x0180  12      0x01A0  13      0x01C0  14      0x01E0  15
0x0200  16      0x0220  17      0x0240  18      0x0260  19
0x0280  20      0x02A0  21      0x02C0  22      0x02E0  23
0x0300  24      0x0320  25      0x0340  26      0x0360  27
0x0380  28      0x03A0  29      0x03C0  30      0x03E0  31
0x0400  32      0x0420  33      0x0440  34      0x0460  35
0x0480  36      0x04A0  37      0x04C0  38      0x04E0  39
0x0500  40      0x0520  41      0x0540  42      0x0560  43
0x0580  44      0x05A0  45      0x05C0  46      0x05E0  47
0x0600  48      0x0620  49      0x0640  50      0x0660  51
0x0680  52      0x06A0  53      0x06C0  54      0x06E0  55
0x0700  56      0x0720  57      0x0740  58      0x0760  59
0x0780  60      0x07A0  61      0x07C0  62      0x07E0  63

Within each buffer, the value for antenna 0 is stored at address offset 0x0 and the value for antenna 1 is stored at address offset 0x04. The following table shows the offset of each buffer.

Offset  User Buffer
0x00    0
0x08    1
0x10    2
0x18    3

Each channel is provided a crossbar (e.g., RACEway™) route on the bus, and a base address for buffering output on a slot basis. Registers for controlling buffers are allocated as shown in the following two tables. External devices are blocked from writing to register addresses marked as reserved.

Offset  User    Offset  User    Offset  User    Offset  User
0x0000  0       0x0020  1       0x0040  2       0x0060  3
0x0080  4       0x00A0  5       0x00C0  6       0x00E0  7
0x0100  8       0x0120  9       0x0140  10      0x0160  11
0x0180  12      0x01A0  13      0x01C0  14      0x01E0  15
0x0200  16      0x0220  17      0x0240  18      0x0260  19
0x0280  20      0x02A0  21      0x02C0  22      0x02E0  23
0x0300  24      0x0320  25      0x0340  26      0x0360  27
0x0380  28      0x03A0  29      0x03C0  30      0x03E0  31
0x0400  32      0x0420  33      0x0440  34      0x0460  35
0x0480  36      0x04A0  37      0x04C0  38      0x04E0  39
0x0500  40      0x0520  41      0x0540  42      0x0560  43
0x0580  44      0x05A0  45      0x05C0  46      0x05E0  47
0x0600  48      0x0620  49      0x0640  50      0x0660  51
0x0680  52      0x06A0  53      0x06C0  54      0x06E0  55
0x0700  56      0x0720  57      0x0740  58      0x0760  59
0x0780  60      0x07A0  61      0x07C0  62      0x07E0  63

Offset  Entry
0x0000  Route to Channel Destination
0x0004  Base Address for Buffers
0x0008  Buffers
0x000C  RESERVED
0x0010  RESERVED
0x0014  RESERVED
0x0018  RESERVED
0x001C  RESERVED

Slot buffer size is automatically determined by the channel spread factor. Buffers are used in round-robin fashion and all buffers for a channel must be arranged contiguously. The buffers control register determines how many buffers are allocated for each channel. A setting of 0 indicates one available buffer, a setting of 1 indicates two available buffers, and so on.
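The round-robin use of contiguous output buffers can be sketched as follows. The slot buffer size in bytes is determined by the hardware from the spread factor and is not specified here, so it appears as a caller-supplied parameter; all function and parameter names are hypothetical.

```python
def buffer_count(setting):
    """Number of buffers implied by the buffers control register (0 means one buffer)."""
    return setting + 1

def next_buffer(index, setting):
    """Index of the buffer used after `index`, in round-robin order."""
    return (index + 1) % buffer_count(setting)

def buffer_address(base_address, index, slot_buffer_bytes):
    """Start address of buffer `index`, assuming contiguous buffers of equal size."""
    return base_address + index * slot_buffer_bytes
```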

A further understanding of the operation of the illustrated and other embodiments of the invention may be attained by reference to (i) U.S. Provisional Application Ser. No. 60/275,846 filed Mar. 14, 2001, entitled “Improved Wireless Communications Systems and Methods”; (ii) U.S. Provisional Application Ser. No. 60/289,600 filed May 7, 2001, entitled “Improved Wireless Communications Systems and Methods Using Long-Code Multi-User Detection” and (iii) U.S. Provisional Application Ser. No. 60/295,060 filed Jun. 1, 2001 entitled “Improved Wireless Communications Systems and Methods for a Communications Computer,” the teachings all of which are incorporated herein by reference, and a copy of the latter of which may be filed herewith.

The above embodiments are presented for illustrative purposes only. Those skilled in the art will appreciate that various modifications can be made to these embodiments without departing from the scope of the present invention. For example, multiple summations can be utilized by a system of the invention, and not separate summations as described herein. Moreover, by way of further non-limiting example, it will be appreciated that although the terminology used above is largely based on the UMTS CDMA protocols, the methods and apparatus described herein are equally applicable to DS/CDMA, CDMA2000 1X, CDMA2000 1xEV-DO, and other forms of CDMA.

1. In a spread spectrum communication system that processes one or more spread-spectrum waveforms (“user spread-spectrum waveforms”), each representative of a waveform associated with a respective user, the improvement comprising: a first logic element which generates an estimated composite spread-spectrum waveform that is a function of one or more of estimated complex amplitudes, estimated time lags, estimated symbols, and codes of the one or more user spread-spectrum waveforms, one or more second logic elements each coupled to the first logic element, the one or more second logic elements generating a second matched-filter detection statistic for at least a selected user as a function of a difference between a first matched-filter detection statistic for that user and an estimated matched-filter detection statistic for that user as a function of the estimated composite spread-spectrum waveform.
2. The system of claim 1, wherein the one or more second logic elements generate the second matched-filter detection statistic for at least the selected user as a function of a difference between (i) a sum of the first matched-filter detection statistic for that user and a characteristic of an estimate of the selected user's spread-spectrum waveform and (ii) the estimated matched-filter detection statistic for that user.
3. The system of claim 2, wherein the characteristic is at least one of an estimated amplitude and an estimated symbol associated with the estimate of the selected user's spread-spectrum waveform.
4. The system of claim 1, wherein the spread-spectrum communications system is a code division multiple access (CDMA) base station.
5. The system of claim 4, wherein the CDMA base station comprises long-code receivers.
6. The system of claim 1, wherein the first logic element comprises arithmetic logic which generates an estimated composite spread waveform based on the relation

$\rho^{(n)}[t] = \sum\limits_{k=1}^{K_{v}} \sum\limits_{p=1}^{L} \sum\limits_{r} \delta\left[t - \hat{\tau}_{kp}^{(n)} - rN_{c}\right] \cdot \hat{a}_{kp}^{(n)} \cdot c_{k}[r] \cdot \hat{b}_{k}^{(n)}\left[\left\lfloor r/N_{k} \right\rfloor\right],$

wherein $K_{v}$ is a number of simultaneous dedicated physical channels for all users, $\delta[t]$ is a discrete-time delta function, $\hat{a}_{kp}^{(n)}$ is an estimated complex channel amplitude for the $p^{th}$ multipath component for the $k^{th}$ user, $c_{k}[r]$ represents a user code comprising at least a scrambling code, an orthogonal variable spreading factor code, and a j factor associated with even numbered dedicated physical channels, $\hat{b}_{k}^{(n)}[m]$ represents a soft symbol estimate for the $k^{th}$ user for the $m^{th}$ symbol period, $\hat{\tau}_{kp}^{(n)}$ is an estimated time lag for the $p^{th}$ multipath component for the $k^{th}$ user, $N_{k}$ is a spreading factor for the $k^{th}$ user, $t$ is a sample time index, $L$ is a number of multi-path components, $N_{c}$ is a number of samples per chip, and $n$ is an iteration count.
7. The system of claim 6, wherein the arithmetic logic further generates the estimated composite spread-spectrum waveform based on the relation

$\hat{r}^{(n)}[t] = \sum\limits_{r} g[r]\,\rho^{(n)}[t - r],$

wherein $\hat{r}^{(n)}[t]$ represents the estimated composite spread-spectrum waveform, and $g[t]$ represents a pulse shape.
8. The system of claim 1, wherein the estimated composite spread-spectrum waveform is pulse-shaped and is based on the user spread-spectrum waveform.
9. The system of claim 1, wherein each second logic element comprises rake logic and summation logic which generates the second matched-filter detection statistic based on the relation

$y_{k}^{(n+1)}[m] = {A_{k}^{(n)}}^{2} \cdot \hat{b}_{k}^{(n)}[m] + y_{k}^{(n)}[m] - y_{est,k}^{(n)}[m];$

wherein ${A_{k}^{(n)}}^{2}$ represents an amplitude statistic, $\hat{b}_{k}^{(n)}[m]$ represents a soft symbol estimate for the $k^{th}$ user for the $m^{th}$ symbol period, $y_{k}^{(n)}[m]$ represents the first matched-filter detection statistic for the selected user, $y_{est,k}^{(n)}[m]$ represents the estimated matched-filter detection statistic for the selected user, and $n$ is an iteration count.
10. The system of claim 9, wherein the system generates the second matched-filter detection statistic for the selected user and zero, one or more further second matched-filter detection statistics for that user iteratively.
11. The system of claim 1, wherein the first matched-filter detection statistic for at least the selected user is generated by a long-code receiver.
12. The system of claim 1, wherein the logic elements are implemented on any of processors, field programmable gate arrays, array processors and co-processors, or any combination thereof.
13. A method for multiple user detection in a spread-spectrum communication system that processes long-code spread-spectrum user waveforms, the improvement comprising a method of generating user matched-filter detection statistics for at least a selected user comprising: generating a composite spread-spectrum waveform as a function of a pulsed-shaped composite re-spread waveform, generating a refined user matched-filter detection statistic for at least the selected user that is a function of a difference between a first matched-filter detection statistic for that user and an estimated matched-filter detection statistic for that user further comprising generating the refined matched-filter detection statistic for at least the selected user as a function of a difference between (i) the sum of the first matched-filter detection statistic for that user and a characteristic of an estimate of the selected user's spread-spectrum waveform and (ii) the estimated matched-filter detection statistic for that user.
14. The method of claim 13, the further improvement wherein the characteristic is at least one of an estimated amplitude, and an estimated symbol associated with an estimate of the selected user's spread-spectrum waveform.
15. A method for multiple user detection in a spread-spectrum communication system that processes long-code spread-spectrum user waveforms, the improvement comprising a method of generating user matched-filter detection statistics for at least a selected user comprising: generating a composite spread-spectrum waveform as a function of a pulsed-shaped composite re-spread waveform, generating a refined user matched-filter detection statistic for at least the selected user that is a function of a difference between a first matched-filter detection statistic for that user and an estimated matched-filter detection statistic for that user wherein the step of generating the second matched-filter detection statistic representative of that user further comprises performing arithmetic logic based on the relation

$y_{k}^{(n+1)}[m] = {A_{k}^{(n)}}^{2} \cdot \hat{b}_{k}^{(n)}[m] + y_{k}^{(n)}[m] - y_{est,k}^{(n)}[m],$

wherein ${A_{k}^{(n)}}^{2}$ represents an amplitude statistic, $\hat{b}_{k}^{(n)}[m]$ represents a soft symbol estimate for the $k^{th}$ user for the $m^{th}$ symbol period, $y_{k}^{(n)}[m]$ represents the first matched-filter detection statistic, $y_{est,k}^{(n)}[m]$ represents the estimated matched-filter detection statistic, and $n$ is an iteration count.
16. The method of claim 15, wherein second matched-filter detection statistic is derived from the estimated composite spread-spectrum waveform based on the relation

$y_{est,k}^{(n)}[m] \equiv {Re}\left\{ \sum\limits_{p=1}^{L} \hat{a}_{kp}^{(n)H} \cdot \frac{1}{2N_{k}} \sum\limits_{r=0}^{N_{k}-1} \hat{r}^{(n)}\left[rN_{c} + \hat{\tau}_{kp}^{(n)} + mT_{k}\right] \cdot c_{km}^{*}[r] \right\},$

wherein $L$ is a number of multi-path components, $\hat{a}_{kp}^{(n)}$ is an estimated complex channel amplitude for the $p^{th}$ multipath component for the $k^{th}$ user, $N_{k}$ is a spreading factor for the $k^{th}$ user, $\hat{r}^{(n)}[t]$ represents the estimated composite spread-spectrum waveform, $N_{c}$ is a number of samples per chip, and $\hat{\tau}_{kp}^{(n)}$ is an estimated time lag for the $p^{th}$ multipath component for the $k^{th}$ user, $m$ is a symbol period, $T_{k}$ is a data bit duration, $n$ is an iteration count and $c_{km}[r]$ represents a user code comprising at least a scrambling code, an orthogonal variable spreading factor code, and a j factor associated with even numbered dedicated physical channels.